ElevenLabs Text to Speech Flash V2.5
POST
Convert text to speech using the voice of your choice and return audio.
Request Headers
Enum value: application/json
Bearer authentication format: Bearer {{API Key}}.
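As a minimal sketch of the two headers described above: the header names `Content-Type` and `Authorization` are assumptions based on standard HTTP usage, since the page itself lists only the values. The helper name `build_headers` is hypothetical.

```python
def build_headers(api_key: str) -> dict[str, str]:
    """Build the request headers described above.

    The header names are assumed (standard HTTP conventions); the source
    lists only the values "application/json" and "Bearer {{API Key}}".
    """
    if not api_key:
        raise ValueError("api_key must be a non-empty string")
    return {
        "Content-Type": "application/json",    # assumed header name
        "Authorization": f"Bearer {api_key}",  # assumed header name
    }
```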
Request Body
If specified, the system will try to sample deterministically. Repeated requests with the same seed and parameters should return the same result, but full determinism is not guaranteed.
Value range: [0, 4294967295]
The text to convert to speech.
Whether to enable Stream mode.
The voice ID to use.
The text that follows the text in the current request. Used to improve speech continuity when stitching together multiple generations.
The language code (ISO 639-1) used for the model and text normalization. If the model does not support this language code, an error will be returned.
The output format of the generated audio. The format is codec_sample_rate_bitrate. A 192 kbps bitrate for MP3 requires a Creator account or above; a 44.1 kHz sample rate for PCM requires a Pro account or above.
Optional values: mp3_22050_32, mp3_24000_48, mp3_44100_32, mp3_44100_64, mp3_44100_96, mp3_44100_128, mp3_44100_192, pcm_8000, pcm_16000, pcm_22050, pcm_24000, pcm_32000, pcm_44100, pcm_48000, ulaw_8000, alaw_8000, opus_48000_32, opus_48000_64, opus_48000_96, opus_48000_128, opus_48000_192
The text before the text in the current request. Used to improve speech continuity when stitching together multiple generations.
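The codec_sample_rate_bitrate naming of the output format values listed earlier can be unpacked mechanically. The sketch below is illustrative only: the names `OutputFormat`, `parse_output_format`, and `required_tier` are hypothetical, and the tier check encodes just the two account rules stated in the text (192 kbps MP3 and 44.1 kHz PCM), nothing more.

```python
from typing import NamedTuple, Optional


class OutputFormat(NamedTuple):
    codec: str
    sample_rate_hz: int
    bitrate_kbps: Optional[int]  # None for values without a bitrate part


def parse_output_format(value: str) -> OutputFormat:
    """Split a value such as "mp3_44100_128" or "pcm_16000" into its parts."""
    parts = value.split("_")
    if len(parts) not in (2, 3):
        raise ValueError(f"unexpected output format: {value!r}")
    codec = parts[0]
    sample_rate = int(parts[1])  # raises ValueError if not numeric
    bitrate = int(parts[2]) if len(parts) == 3 else None
    return OutputFormat(codec, sample_rate, bitrate)


def required_tier(value: str) -> Optional[str]:
    """Return the minimum account tier named in the text, or None."""
    fmt = parse_output_format(value)
    if fmt.codec == "mp3" and fmt.bitrate_kbps == 192:
        return "Creator"
    if fmt.codec == "pcm" and fmt.sample_rate_hz == 44100:
        return "Pro"
    return None
```

Values such as pcm_8000 or ulaw_8000 carry no bitrate component, so the parser reports `None` for that field rather than guessing one.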
If true, use the IVC version of the voice instead of the PVC version. This is a temporary workaround for the higher latency of the PVC version.
A list of request_id values for subsequent samples. Used to maintain speech continuity when regenerating samples. Up to 3 request_id values can be provided.
Array length: 0 - 3
A list of request_id values for samples generated before the current generation. Can be used to improve speech continuity. Up to 3 request_id values can be provided.
Array length: 0 - 3
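The continuity parameters above are meant for stitching several generations together. The following sketch shows one way the surrounding-text fields could be filled in when a long script is split into chunks. The field names `text`, `previous_text`, and `next_text` are assumptions (the parameter names are not shown on this page), and `with_context` is a hypothetical helper.

```python
def with_context(chunks: list[str]) -> list[dict[str, str]]:
    """Build one request body per chunk, attaching its neighbouring text.

    Field names are assumed for illustration; check them against the
    actual parameter names before use.
    """
    bodies = []
    for i, text in enumerate(chunks):
        body = {"text": text}
        if i > 0:
            body["previous_text"] = chunks[i - 1]  # text before this chunk
        if i + 1 < len(chunks):
            body["next_text"] = chunks[i + 1]      # text after this chunk
        bodies.append(body)
    return bodies
```

The first chunk gets no preceding text and the last gets no following text, so the optional fields are simply omitted at the edges.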
Controls text normalization. ‘auto’ lets the system decide, ‘on’ always normalizes, and ‘off’ skips normalization.
Optional values: auto, on, off
Controls language-specific text normalization for certain supported languages to achieve more natural pronunciation. Warning: this may significantly increase latency. Currently only Japanese is supported.
A list of pronunciation dictionary locators (id, version_id) to apply to the text. They take effect in order. Each request can include up to 3 locators.
Array length: 0 - 3
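The numeric and length limits stated in this section can be checked on the client before a request is sent. This is a minimal sketch, assuming hypothetical field names (`seed`, `previous_request_ids`, `next_request_ids`, `pronunciation_dictionary_locators`, `apply_text_normalization`), since the page does not show the real ones. Only the limits themselves come from the text.

```python
MAX_SEED = 4294967295  # upper bound of the documented seed range
MAX_ITEMS = 3          # documented maximum array length

# Assumed field names; the source does not list them.
LIST_FIELDS = (
    "previous_request_ids",
    "next_request_ids",
    "pronunciation_dictionary_locators",
)


def validate_body(body: dict) -> list[str]:
    """Return a list of human-readable problems; empty means none found."""
    errors = []
    seed = body.get("seed")
    if seed is not None:
        is_int = isinstance(seed, int) and not isinstance(seed, bool)
        if not is_int or not 0 <= seed <= MAX_SEED:
            errors.append(f"seed must be an integer in [0, {MAX_SEED}]")
    for key in LIST_FIELDS:
        if len(body.get(key, [])) > MAX_ITEMS:
            errors.append(f"{key} accepts at most {MAX_ITEMS} items")
    norm = body.get("apply_text_normalization")
    if norm is not None and norm not in ("auto", "on", "off"):
        errors.append("apply_text_normalization must be auto, on, or off")
    return errors
```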
Response Information
Generated audio file. Format: binary