POST /v3/async/minimax-speech-2.5-turbo-preview
MiniMax Speech-2.5-turbo-preview Asynchronous Speech Synthesis
curl --request POST \
  --url https://api.highwayapi.ai/v3/async/minimax-speech-2.5-turbo-preview \
  --header 'Authorization: <authorization>' \
  --header 'Content-Type: <content-type>' \
  --data '
{
  "text": "<string>",
  "voice_setting": {
    "speed": 123,
    "vol": 123,
    "pitch": 123,
    "voice_id": "<string>",
    "emotion": "<string>",
    "text_normalization": true
  },
  "audio_setting": {
    "sample_rate": 123,
    "bitrate": 123,
    "format": "<string>",
    "channel": 123
  },
  "pronunciation_dict": {
    "tone": [
      {}
    ]
  },
  "language_boost": "<string>",
  "voice_modify": {
    "pitch": 123,
    "intensity": 123,
    "timbre": 123,
    "sound_effects": "<string>"
  }
}
'
{
  "task_id": "<string>"
}
This API generates speech from text asynchronously. A single text generation transfer supports up to 1 million characters, and the complete generated audio is retrieved asynchronously once the task finishes. You can choose from 100+ system voices as well as cloned voices, and independently adjust intonation, speaking speed, volume, bitrate, sample rate, and output format. Note that the URL returned for a long-text speech synthesis request is valid for 24 hours from the time it is returned, so download the result within that window.
This API is suited to generating speech from long texts, such as entire books. Tasks may wait in the queue for a long time. For short-sentence generation, voice chat, online social networking, and similar scenarios, we recommend using synchronous speech synthesis calls instead.
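
The 24-hour validity window mentioned above can be tracked on the client side. A minimal sketch, assuming you record the time at which the URL was returned; the helper names `download_deadline` and `is_url_expired` are hypothetical and not part of the API:

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Validity window stated above: 24 hours from the time the URL is returned.
URL_VALIDITY = timedelta(hours=24)


def download_deadline(returned_at: datetime) -> datetime:
    """Return the moment from which the result URL should be treated as expired."""
    return returned_at + URL_VALIDITY


def is_url_expired(returned_at: datetime, now: Optional[datetime] = None) -> bool:
    """Check whether the 24-hour window has elapsed (hypothetical client-side helper)."""
    if now is None:
        now = datetime.now(timezone.utc)
    return now >= download_deadline(returned_at)


# Example: a URL returned at noon UTC on 1 January should be downloaded
# before noon UTC on 2 January.
returned_at = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
deadline = download_deadline(returned_at)
```

Storing the timestamp in UTC avoids surprises when the download job runs in a different time zone from the job that submitted the task.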

Request Headers

Content-Type
string
required
Enum value: application/json
Authorization
string
required
Bearer authentication format: Bearer {{API Key}}.
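
The two required headers can be assembled as follows. This is only a sketch: the `build_headers` helper and the `HIGHWAY_API_KEY` environment variable name are assumptions made for illustration, not names defined by the API.

```python
import os


def build_headers(api_key: str) -> dict:
    """Build the two required request headers (hypothetical helper)."""
    if not api_key:
        raise ValueError("an API key is required")
    return {
        # The only documented value for Content-Type.
        "Content-Type": "application/json",
        # Bearer authentication format: "Bearer " followed by the API key.
        "Authorization": f"Bearer {api_key}",
    }


# The environment variable name is an assumption; "sk-example" is a placeholder.
headers = build_headers(os.environ.get("HIGHWAY_API_KEY", "sk-example"))
```

Reading the key from the environment keeps it out of source control; any other secret store would work equally well.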

Request Body

text
string
required
The text to synthesize, with a maximum length of 50,000 characters.
voice_setting
object
required
audio_setting
object
pronunciation_dict
object
language_boost
string
default:"null"
Enhances recognition of specified less common languages and dialects. Setting this parameter can improve speech performance for the specified language or dialect. If you are unsure which language applies, choose “auto” and the model will determine the language on its own. The following values are supported: 'Chinese', 'Chinese,Yue', 'English', 'Arabic', 'Russian', 'Spanish', 'French', 'Portuguese', 'German', 'Turkish', 'Dutch', 'Ukrainian', 'Vietnamese', 'Indonesian', 'Japanese', 'Italian', 'Korean', 'Thai', 'Polish', 'Romanian', 'Greek', 'Czech', 'Finnish', 'Hindi', 'Bulgarian', 'Danish', 'Hebrew', 'Malay', 'Persian', 'Slovak', 'Swedish', 'Croatian', 'Filipino', 'Hungarian', 'Norwegian', 'Slovenian', 'Catalan', 'Nynorsk', 'Tamil', 'Afrikaans', 'auto'.
voice_modify
object
Voice effects processor settings. Supported audio formats for this parameter: mp3, wav, flac.
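
A request body can be validated on the client before it is sent. The sketch below is hedged in several ways: `build_payload` is a hypothetical helper, it assumes that `voice_setting` may carry only a `voice_id` (the required sub-fields are not listed here), and it uses the 50,000-character limit stated for `text`, even though the overview mentions a larger per-transfer figure, so confirm the limit before relying on it.

```python
import json

# Limit stated for the `text` field; treat as an assumption to confirm.
MAX_TEXT_CHARS = 50_000

# Values listed for `language_boost` in this document.
SUPPORTED_LANGUAGE_BOOST = {
    "Chinese", "Chinese,Yue", "English", "Arabic", "Russian", "Spanish",
    "French", "Portuguese", "German", "Turkish", "Dutch", "Ukrainian",
    "Vietnamese", "Indonesian", "Japanese", "Italian", "Korean", "Thai",
    "Polish", "Romanian", "Greek", "Czech", "Finnish", "Hindi", "Bulgarian",
    "Danish", "Hebrew", "Malay", "Persian", "Slovak", "Swedish", "Croatian",
    "Filipino", "Hungarian", "Norwegian", "Slovenian", "Catalan", "Nynorsk",
    "Tamil", "Afrikaans", "auto",
}


def build_payload(text, voice_id, language_boost=None):
    """Assemble a minimal request body (hypothetical helper, not an official client)."""
    if not text:
        raise ValueError("text is required")
    if len(text) > MAX_TEXT_CHARS:
        raise ValueError(f"text exceeds {MAX_TEXT_CHARS} characters")
    # Assumption: voice_id alone is an acceptable voice_setting.
    payload = {"text": text, "voice_setting": {"voice_id": voice_id}}
    if language_boost is not None:
        if language_boost not in SUPPORTED_LANGUAGE_BOOST:
            raise ValueError(f"unsupported language_boost: {language_boost!r}")
        payload["language_boost"] = language_boost
    return payload


# "<voice_id>" is a placeholder; substitute a real system or cloned voice ID.
body = json.dumps(build_payload("Hello, world.", "<voice_id>", language_boost="auto"))
```

Rejecting oversized text or an unknown `language_boost` locally avoids waiting in the task queue for a request that would fail anyway.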

Response Parameters

task_id
string
required
The ID of the asynchronous task. Pass this task_id to the Query Task Result API to obtain the generated result.
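
Since the response carries only a `task_id`, client code typically extracts it and stores it for the later query. A minimal sketch, assuming the response body is the JSON object shown in the example; `extract_task_id` is a hypothetical helper, and the Query Task Result API itself is not covered here, so the sketch stops at obtaining the ID:

```python
import json


def extract_task_id(response_body: str) -> str:
    """Parse the submit response and return its task_id (hypothetical helper)."""
    data = json.loads(response_body)
    task_id = data.get("task_id") if isinstance(data, dict) else None
    if not isinstance(task_id, str) or not task_id:
        raise ValueError("response does not contain a task_id string")
    return task_id


# Example with a made-up ID; real IDs are issued by the API.
task_id = extract_task_id('{"task_id": "example-task-id"}')
```

Failing loudly when `task_id` is missing makes it easier to spot error responses, whose shape is not documented in this section.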