Audio Processing

Text to Speech

Convert text into provider-backed speech, adjust playback characteristics, and save reusable audio generations. The exact voice catalog is dynamic and depends on the connected backend plus your current plan tier.

Current Access Rules

Cost

1 credit per synthesis

Catalog

Voice availability depends on provider and plan tier

History

Successful generations are saved to TTS history

How It Works

Paste or type your script into the editor, choose a language and voice from the currently available catalog, adjust rate, pitch, and volume, then synthesize. The server action validates the request, checks your plan-based voice access, deducts 1 credit, and stores successful output in history.
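The order of those server-side checks can be sketched roughly as follows. This is a minimal illustration, not the app's actual server action: every name here (`runSynthesis`, `SynthesisRequest`, `AccountBalance`, the failure reasons) is hypothetical, the provider call is omitted, and the sketch does not model what happens to the credit when the provider rejects a request.

```typescript
// Hypothetical shapes; the real request and account models are not documented here.
interface SynthesisRequest {
  text: string;
  voiceId: string;
  ratePercent: number;
  pitchPercent: number;
  volumePercent: number; // documented range: 0-100
}

interface AccountBalance {
  credits: number;
}

type SynthesisOutcome =
  | { ok: true; creditsLeft: number; historyIndex: number }
  | { ok: false; reason: string };

const COST_PER_SYNTHESIS = 1; // 1 credit per synthesis, as stated above

function runSynthesis(
  request: SynthesisRequest,
  account: AccountBalance,
  allowedVoiceIds: string[], // voices already filtered by plan tier
  savedEntries: SynthesisRequest[], // stand-in for TTS history storage
): SynthesisOutcome {
  // 1. Validate the request.
  if (request.text.trim().length === 0) {
    return { ok: false, reason: "empty-text" };
  }
  if (request.volumePercent < 0 || request.volumePercent > 100) {
    return { ok: false, reason: "volume-out-of-range" };
  }
  // 2. Check plan-based voice access.
  if (allowedVoiceIds.indexOf(request.voiceId) === -1) {
    return { ok: false, reason: "voice-not-in-plan" };
  }
  // 3. Deduct credits.
  if (account.credits < COST_PER_SYNTHESIS) {
    return { ok: false, reason: "insufficient-credits" };
  }
  account.credits -= COST_PER_SYNTHESIS;
  // 4. (Provider call would happen here.) Store the successful output in history.
  savedEntries.push(request);
  return { ok: true, creditsLeft: account.credits, historyIndex: savedEntries.length - 1 };
}
```

Returning a tagged result instead of throwing keeps each failure reason distinguishable, which lines up with the separate cases in the Troubleshooting section.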

Dynamic Voice Catalog

The voice picker is loaded from the connected provider rather than a hard-coded list, so the available voices can change over time.

Playback Controls

Adjust speed, pitch, and volume before synthesis, then load prior generations back into the editor for a quick retry.

Saved Output

Successful generations are uploaded and stored so you can replay, download, or delete them later from history.

Voice Access by Plan

Free

Standard voice subset. The app intentionally limits the catalog to a smaller non-multilingual selection.

Pro

HD voice catalog unlocked, but multilingual ultra voices remain gated.

Elite

Full voice catalog unlocked, including multilingual ultra voices when the provider exposes them.

Earlier versions of this docs page showed a fixed voice catalog. The app no longer works that way; it fetches provider voices dynamically and filters them by tier.
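As a rough illustration of tier-based filtering, the sketch below applies the three plan rules to a provider voice list. The `ProviderVoice` shape, the `voiceClass` and `multilingual` fields, and the function names are assumptions made for this example. The docs do not say how the provider labels voices, whether Free sees every standard voice or only some of them, or how Pro treats ultra voices that are not multilingual. The sketch assumes Pro gates only voices that are both ultra and multilingual.

```typescript
// Hypothetical voice metadata; the provider's real fields may differ.
type PlanTier = "free" | "pro" | "elite";
type VoiceClass = "standard" | "hd" | "ultra";

interface ProviderVoice {
  id: string;
  locale: string;
  voiceClass: VoiceClass;
  multilingual: boolean;
}

function isVoiceAllowed(voice: ProviderVoice, tier: PlanTier): boolean {
  if (tier === "free") {
    // Free: standard, non-multilingual voices only.
    return voice.voiceClass === "standard" && !voice.multilingual;
  }
  if (tier === "pro") {
    // Pro: HD unlocked, multilingual ultra voices stay gated (assumed rule).
    return !(voice.voiceClass === "ultra" && voice.multilingual);
  }
  // Elite: everything the provider currently exposes.
  return true;
}

function filterVoicesForTier(voices: ProviderVoice[], tier: PlanTier): ProviderVoice[] {
  return voices.filter((voice) => isVoiceAllowed(voice, tier));
}
```

Because the filter runs over whatever list the provider returns, a voice that disappears upstream also disappears from every tier without any change to the rules.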

Playback Controls

Control	Range	Description
Rate	Percent-based	Controls how quickly the generated voice speaks.
Pitch	Percent-based	Raises or lowers the perceived pitch before synthesis.
Volume	0-100%	Sets the output level requested from the provider for the generated audio.
History reload	Saved generations	Lets you reopen a prior generation with its saved settings.
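A small sketch of how these controls might be normalized before a request is sent. Only the 0-100% volume range comes from the table above. The `PlaybackSettings` shape and `normalizeSettings` are hypothetical, and no bounds are assumed for rate or pitch because the docs do not state them.

```typescript
// Hypothetical settings shape; field names are illustrative only.
interface PlaybackSettings {
  ratePercent: number;
  pitchPercent: number;
  volumePercent: number;
}

function normalizeSettings(input: PlaybackSettings): PlaybackSettings {
  const values = [input.ratePercent, input.pitchPercent, input.volumePercent];
  // Reject NaN and Infinity outright rather than guessing a replacement.
  if (!values.every((value) => isFinite(value))) {
    throw new RangeError("rate, pitch, and volume must be finite numbers");
  }
  return {
    ratePercent: input.ratePercent,
    pitchPercent: input.pitchPercent,
    // Clamp volume into the documented 0-100% range.
    volumePercent: Math.min(100, Math.max(0, input.volumePercent)),
  };
}
```

Clamping volume while rejecting non-finite values is one possible policy; an implementation could just as reasonably reject out-of-range volume instead of clamping it.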

Language Support

The connected provider determines which locales are available. Sonic AI commonly supports the languages below, but the exact set can grow, shrink, or be renamed when the provider catalog changes.

English
Arabic (MSA)
Egyptian Arabic
French
German
Spanish
Italian
Turkish
Portuguese
Hindi
Japanese
Chinese (Mandarin)
Korean
Provider-dependent
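Because the language set follows the provider catalog, a language selector could be derived from the fetched voices rather than from a fixed list. The sketch below assumes each voice carries a `locale` string; `CatalogVoice` and `listLocales` are hypothetical names, not part of the app.

```typescript
// Hypothetical minimal voice record; only the locale matters for this sketch.
interface CatalogVoice {
  id: string;
  locale: string;
}

// Collect the distinct locales present in the fetched catalog, sorted for display.
function listLocales(voices: CatalogVoice[]): string[] {
  const seen: { [locale: string]: boolean } = {};
  const locales: string[] = [];
  for (const voice of voices) {
    if (!seen[voice.locale]) {
      seen[voice.locale] = true;
      locales.push(voice.locale);
    }
  }
  return locales.sort();
}
```

An empty catalog produces an empty list, which matches the empty-selector behavior described under Troubleshooting.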

TTS History

Successful syntheses are stored with the source text, selected voice, output file, and saved settings. You can replay history entries, download the result, delete them, or load their settings back into the editor.
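The stored fields suggest a record along these lines. This is a sketch under assumptions: the field names, the `TtsHistoryEntry` and `EditorDraft` types, and `reloadIntoEditor` are all hypothetical, and the real storage schema is not documented here.

```typescript
// Hypothetical history record mirroring the fields listed above.
interface SavedSettings {
  ratePercent: number;
  pitchPercent: number;
  volumePercent: number;
}

interface TtsHistoryEntry {
  sourceText: string;
  voiceId: string;
  outputFileUrl: string;
  settings: SavedSettings;
}

// Hypothetical editor state that a history entry can be loaded into.
interface EditorDraft {
  text: string;
  voiceId: string;
  settings: SavedSettings;
}

function reloadIntoEditor(entry: TtsHistoryEntry): EditorDraft {
  return {
    text: entry.sourceText,
    voiceId: entry.voiceId,
    // Copy the settings so edits in the editor do not alter the saved entry.
    settings: {
      ratePercent: entry.settings.ratePercent,
      pitchPercent: entry.settings.pitchPercent,
      volumePercent: entry.settings.volumePercent,
    },
  };
}
```

Copying the settings instead of sharing the object means you can tweak one parameter and retry while the original history entry stays as it was saved.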

Replay saved audio
Download output
Delete entry
Reload saved settings

Troubleshooting

Voice not available

The selected voice may be filtered out by your current plan tier. Free gets a standard subset, Pro gets HD voices, and Elite gets the full catalog.

No voices or languages loaded

The TTS page fetches the voice list from the connected provider. If that backend is unavailable, the selectors can appear empty or incomplete.

Synthesis failed

The current server action deducts 1 credit per synthesis attempt and can fail if the provider rejects the request or your account lacks credits.

Tips for Better Output

Write punctuation intentionally; commas and sentence length change pacing noticeably.
If you need a different voice family, check whether your current plan tier is hiding the option before assuming it is unsupported.
Use history reload when you want to tweak only one parameter instead of rebuilding the job from scratch.
Expect the catalog to evolve over time because it is fetched from the live provider rather than shipped with the UI.