ElevenLabs Text-to-Speech Turbo V2.5

curl --request POST \
  --url https://api.highwayapi.ai/v3/elevenlabs-tts-turbo-v2.5 \
  --header 'Authorization: Bearer <API Key>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "seed": 123,
  "text": "<string>",
  "voice_id": "<string>",
  "next_text": "<string>",
  "language_code": "<string>",
  "output_format": "<string>",
  "previous_text": "<string>",
  "use_pvc_as_ivc": true,
  "voice_settings": {
    "speed": 123,
    "style": 123,
    "stability": 123,
    "similarity_boost": 123,
    "use_speaker_boost": true
  },
  "next_request_ids": [
    {}
  ],
  "previous_request_ids": [
    {}
  ],
  "apply_text_normalization": "<string>",
  "apply_language_text_normalization": true,
  "pronunciation_dictionary_locators": [
    {
      "version_id": "<string>",
      "pronunciation_dictionary_id": "<string>"
    }
  ]
}
'

POST

elevenlabs-tts-turbo-v2.5

Convert text into speech using the voice of your choice and return the audio.

Request Headers

Content-Type

string

required

Enum value: application/json

Authorization

string

required

Bearer authentication format: Bearer {{API Key}}.
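As a minimal sketch of how the two required headers fit together, the snippet below builds (but does not send) a request using only the Python standard library. The URL is taken from the curl example above; `build_request`, `YOUR_API_KEY`, and `YOUR_VOICE_ID` are illustrative placeholders, not names defined by the API.

```python
import json
import urllib.request

# Endpoint copied from the curl example; credentials below are placeholders.
URL = "https://api.highwayapi.ai/v3/elevenlabs-tts-turbo-v2.5"


def build_request(api_key: str, text: str, voice_id: str) -> urllib.request.Request:
    """Build a POST request carrying the two required headers and fields."""
    body = json.dumps({"text": text, "voice_id": voice_id}).encode("utf-8")
    return urllib.request.Request(
        URL,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )


req = build_request("YOUR_API_KEY", "Hello there.", "YOUR_VOICE_ID")
# Sending is left to the caller, e.g. urllib.request.urlopen(req).read()
```

Nothing is transmitted until the request is passed to `urllib.request.urlopen`, so the sketch can be inspected safely before a real key is supplied.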

Request Body

seed

integer

If specified, the system will try to sample deterministically. Repeated requests with the same seed and parameters should return the same result, but full determinism is not guaranteed. Value range: [0, 4294967295]
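The documented range corresponds to an unsigned 32-bit integer. A client-side check along these lines could catch out-of-range seeds before a request is made; `with_seed` and `MAX_SEED` are hypothetical helper names used only for this sketch.

```python
MAX_SEED = 4294967295  # 2**32 - 1, the upper bound of the documented range


def with_seed(payload: dict, seed: int) -> dict:
    """Return a copy of the payload with a range-checked seed added."""
    if not 0 <= seed <= MAX_SEED:
        raise ValueError(f"seed must be within [0, {MAX_SEED}], got {seed}")
    return {**payload, "seed": seed}
```

Reusing the same seed with identical parameters is the documented way to request repeatable output, although identical audio is still not guaranteed.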

text

string

required

The text to convert into speech.

voice_id

string

required

The voice ID to use.

next_text

string

Text that follows the current request text. Used to improve speech continuity when stitching together multiple generations.

language_code

string

Language code (ISO 639-1) used for the model and text normalization. If the model does not support this language code, an error will be returned.

output_format

string

default:"mp3_44100_128"

The output format of the generated audio. The format is codec_sample_rate_bitrate. The 192 kbps bitrate for MP3 requires a Creator account or above, and the 44.1 kHz sample rate for PCM requires a Pro account or above. Available values: mp3_22050_32, mp3_24000_48, mp3_44100_32, mp3_44100_64, mp3_44100_96, mp3_44100_128, mp3_44100_192, pcm_8000, pcm_16000, pcm_22050, pcm_24000, pcm_32000, pcm_44100, pcm_48000, ulaw_8000, alaw_8000, opus_48000_32, opus_48000_64, opus_48000_96, opus_48000_128, opus_48000_192
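To illustrate the codec_sample_rate_bitrate naming, here is a small sketch that splits a format string into its parts. Note that the pcm, ulaw, and alaw values in the list carry no bitrate component. `AudioFormat` and `parse_output_format` are hypothetical names introduced for this example.

```python
from typing import NamedTuple, Optional


class AudioFormat(NamedTuple):
    codec: str
    sample_rate_hz: int
    bitrate_kbps: Optional[int]  # None for formats without a bitrate part


def parse_output_format(value: str) -> AudioFormat:
    """Split a value such as 'mp3_44100_128' or 'pcm_16000' into its parts."""
    parts = value.split("_")
    if len(parts) not in (2, 3):
        raise ValueError(f"unexpected output_format: {value!r}")
    codec = parts[0]
    sample_rate_hz = int(parts[1])
    bitrate_kbps = int(parts[2]) if len(parts) == 3 else None
    return AudioFormat(codec, sample_rate_hz, bitrate_kbps)
```

This only decodes the naming scheme. It does not check the value against the list of available formats or against account-tier restrictions.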

previous_text

string

Text that precedes the current request text. Used to improve speech continuity when stitching together multiple generations.
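When a long script is split into several requests, previous_text and next_text can carry the neighbouring text for each piece. The sketch below assumes that passing only the directly adjacent chunk is sufficient context; `stitched_payloads` is a hypothetical helper, and how much surrounding text works best is not specified here.

```python
from typing import Dict, List


def stitched_payloads(chunks: List[str], voice_id: str) -> List[Dict[str, str]]:
    """Build one request body per chunk, with neighbouring chunks as context."""
    payloads = []
    for i, chunk in enumerate(chunks):
        payload = {"text": chunk, "voice_id": voice_id}
        if i > 0:
            # Text spoken just before this chunk.
            payload["previous_text"] = chunks[i - 1]
        if i < len(chunks) - 1:
            # Text that will be spoken right after this chunk.
            payload["next_text"] = chunks[i + 1]
        payloads.append(payload)
    return payloads
```

The first chunk has no previous_text and the last has no next_text, so those optional fields are simply omitted.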

use_pvc_as_ivc

boolean

default:false

If true, use the IVC (Instant Voice Cloning) version of the voice instead of the PVC (Professional Voice Cloning) version. This is a temporary workaround for the higher latency of the PVC version.

voice_settings

object

Properties

speed

number

default:1

Adjusts the speed of the voice. 1.0 is the default speed; values below 1.0 slow it down, and values above 1.0 speed it up.

style

number

default:0

Determines the intensity of the speaking style by attempting to amplify the original speaker’s style. Setting this to a non-zero value consumes more compute resources and may increase latency.

stability

number

Determines the stability of speech generation and the randomness between generations. Lower values provide a wider emotional range, while higher values may result in monotonous speech.

similarity_boost

number

Determines how closely the AI attempts to replicate the original voice.

use_speaker_boost

boolean

default:true

Enhances similarity to the original speaker. This adds a slight computational load and increases latency.
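The voice_settings properties above can be sketched as a small container that applies the documented defaults (speed 1, style 0, use_speaker_boost true) and leaves out stability and similarity_boost unless they are set, since no default is listed for them. `VoiceSettings` and `to_payload` are hypothetical names, and valid numeric ranges are not stated here, so none are enforced.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class VoiceSettings:
    speed: float = 1.0              # documented default
    style: float = 0.0              # documented default
    use_speaker_boost: bool = True  # documented default
    stability: Optional[float] = None         # no documented default; omitted if unset
    similarity_boost: Optional[float] = None  # no documented default; omitted if unset

    def to_payload(self) -> dict:
        """Return the voice_settings object, dropping fields that were not set."""
        return {key: value for key, value in asdict(self).items() if value is not None}
```

For example, `VoiceSettings(stability=0.4, speed=0.9).to_payload()` yields a dictionary that can be placed under the voice_settings key of the request body.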

next_request_ids

array

A list of request_id values for subsequent samples. Used to maintain speech continuity when regenerating samples. Up to 3 request_id values can be provided. Array length: 0 - 3

previous_request_ids

array

A list of request_id values for samples generated before the current generation. Can be used to improve speech continuity. Up to 3 request_id values can be provided. Array length: 0 - 3
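Because at most 3 request_id values are accepted, a client that generates many samples in sequence needs to keep only the most recent ones. A rough sketch using a bounded deque follows; `RequestHistory` is a hypothetical helper, and how a request_id is obtained from a response is not covered here, so the IDs below are made-up strings.

```python
from collections import deque
from typing import List

MAX_REQUEST_IDS = 3  # documented array length limit


class RequestHistory:
    """Remember only the most recent request IDs, oldest first."""

    def __init__(self) -> None:
        self._ids = deque(maxlen=MAX_REQUEST_IDS)

    def record(self, request_id: str) -> None:
        # Appending to a full deque silently drops the oldest entry.
        self._ids.append(request_id)

    def previous_request_ids(self) -> List[str]:
        return list(self._ids)


history = RequestHistory()
for made_up_id in ["req-1", "req-2", "req-3", "req-4"]:
    history.record(made_up_id)
```

After the loop, `history.previous_request_ids()` holds the last three IDs and can be placed directly under previous_request_ids in the next request body.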

apply_text_normalization

string

default:"auto"

Controls text normalization. ‘auto’ lets the system decide, ‘on’ always normalizes, and ‘off’ skips it. Available values: auto, on, off

apply_language_text_normalization

boolean

default:false

Controls language-specific text normalization for certain supported languages to achieve more natural pronunciation. Warning: This may significantly increase latency. Currently, only Japanese is supported.

pronunciation_dictionary_locators

array

A list of pronunciation dictionary locators (pronunciation_dictionary_id, version_id) to apply to the text. Locators are applied in order. Up to 3 locators are allowed per request. Array length: 0 - 3

Properties

version_id

string

The ID of the pronunciation dictionary version. If not specified, the latest version is used.

pronunciation_dictionary_id

string

required

The ID of the pronunciation dictionary.
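A locator list could be assembled as in the sketch below, which preserves order, enforces the limit of 3, and omits version_id when none is given so that the latest version is used. `build_locators` and the example IDs are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

MAX_LOCATORS = 3  # documented array length limit


def build_locators(entries: List[Tuple[str, Optional[str]]]) -> List[Dict[str, str]]:
    """Turn (dictionary_id, version_id or None) pairs into locator objects."""
    if len(entries) > MAX_LOCATORS:
        raise ValueError(f"at most {MAX_LOCATORS} locators are allowed per request")
    locators = []
    for dictionary_id, version_id in entries:
        locator = {"pronunciation_dictionary_id": dictionary_id}
        if version_id is not None:
            # Leaving version_id out selects the latest version.
            locator["version_id"] = version_id
        locators.append(locator)
    return locators
```

Since locators are applied in order, the order of `entries` determines which dictionary takes effect first.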

Response Information

Generated audio file. Format: binary
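Since the response body is raw binary audio rather than JSON, it should be written to disk in binary mode. A minimal sketch follows, assuming the bytes have already been read from the response; `save_audio` is a hypothetical helper, and choosing a file extension that matches the requested output_format is left to the caller.

```python
from pathlib import Path


def save_audio(audio: bytes, path: str) -> int:
    """Write the raw response bytes to a file and return the byte count."""
    # write_bytes opens the file in binary mode, so no text decoding is applied.
    return Path(path).write_bytes(audio)
```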
