POST /v3/async/minimax-speech-2.6-hd
MiniMax Speech-2.6-hd Asynchronous Speech Synthesis
curl --request POST \
  --url https://api.highwayapi.ai/v3/async/minimax-speech-2.6-hd \
  --header 'Authorization: <authorization>' \
  --header 'Content-Type: <content-type>' \
  --data '
{
  "text": "<string>",
  "voice_setting": {
    "speed": 123,
    "vol": 123,
    "pitch": 123,
    "voice_id": "<string>",
    "emotion": "<string>",
    "text_normalization": true
  },
  "audio_setting": {
    "sample_rate": 123,
    "bitrate": 123,
    "format": "<string>",
    "channel": 123
  },
  "pronunciation_dict": {
    "tone": [
      {}
    ]
  },
  "language_boost": "<string>",
  "voice_modify": {
    "pitch": 123,
    "intensity": 123,
    "timbre": 123,
    "sound_effects": "<string>"
  }
}
'
Example response:
{
  "task_id": "<string>"
}
This API provides asynchronous text-to-speech generation. A single request can carry up to 1 million characters of text, and the complete generated audio is retrieved asynchronously. It supports 100+ system voices as well as custom cloned voices, and allows adjustment of intonation, speed, volume, bitrate, sample rate, and output format. The url returned for a long-text synthesis task is valid for 24 hours from the time it is returned, so download the audio before it expires.
This endpoint is suited to generating speech from long texts such as entire books. Tasks may wait in a queue for a relatively long time, so for short sentence generation, voice chat, and online social scenarios we recommend synchronous speech synthesis instead.
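
Because the returned url is only valid for 24 hours, a client may want to record when it received the url and check the deadline before downloading. The following is a minimal sketch of that bookkeeping; the function names are hypothetical and are not part of the API.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Validity window stated in the overview: 24 hours from when the url is returned.
URL_VALIDITY = timedelta(hours=24)


def url_expires_at(returned_at: datetime) -> datetime:
    """Return the moment the download url stops being valid."""
    return returned_at + URL_VALIDITY


def is_url_expired(returned_at: datetime, now: Optional[datetime] = None) -> bool:
    """Check whether the url received at `returned_at` has expired."""
    if now is None:
        now = datetime.now(timezone.utc)
    return now >= url_expires_at(returned_at)
```

Storing `returned_at` as a timezone-aware UTC timestamp avoids ambiguity when the check runs on a different machine than the one that received the url.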

Request Headers

Content-Type
string
required
Enum value: application/json
Authorization
string
required
Bearer authentication format: Bearer {{API key}}.
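
As an illustration of the two required headers, the helper below builds them from an API key. It is a sketch only: `build_headers` is a hypothetical name, and the key value is supplied by the caller.

```python
def build_headers(api_key: str) -> dict:
    """Build the two required request headers for this endpoint."""
    if not api_key:
        raise ValueError("api_key must not be empty")
    return {
        "Content-Type": "application/json",
        # Bearer authentication format: "Bearer " followed by the API key.
        "Authorization": f"Bearer {api_key}",
    }
```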

Request Body

text
string
required
The text to synthesize, with a maximum length of 50,000 characters.
voice_setting
object
required
audio_setting
object
pronunciation_dict
object
language_boost
string
default:"null"
Enhances recognition of the specified minority language or dialect. Setting this parameter can improve speech quality in scenarios that use that language or dialect. If the language is not known in advance, set the value to 'auto' and the model will determine it automatically. Supported values: 'Chinese', 'Chinese,Yue', 'English', 'Arabic', 'Russian', 'Spanish', 'French', 'Portuguese', 'German', 'Turkish', 'Dutch', 'Ukrainian', 'Vietnamese', 'Indonesian', 'Japanese', 'Italian', 'Korean', 'Thai', 'Polish', 'Romanian', 'Greek', 'Czech', 'Finnish', 'Hindi', 'Bulgarian', 'Danish', 'Hebrew', 'Malay', 'Persian', 'Slovak', 'Swedish', 'Croatian', 'Filipino', 'Hungarian', 'Norwegian', 'Slovenian', 'Catalan', 'Nynorsk', 'Tamil', 'Afrikaans', 'auto'.
voice_modify
object
Voice effects settings. Supported audio formats for this parameter: mp3, wav, flac
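
The request body rules above can be checked client-side before a task is submitted. The sketch below assembles the body and validates it under a few assumptions: the 50,000-character limit comes from the `text` field description, the `voice_modify` check reads the format note as applying to `audio_setting.format`, and `build_payload` and the sample voice ID in the test are hypothetical. The server remains the authority on what is accepted.

```python
from typing import Optional

MAX_TEXT_CHARS = 50_000  # limit from the `text` field description
VOICE_MODIFY_FORMATS = {"mp3", "wav", "flac"}
LANGUAGE_BOOST_VALUES = {
    "Chinese", "Chinese,Yue", "English", "Arabic", "Russian", "Spanish",
    "French", "Portuguese", "German", "Turkish", "Dutch", "Ukrainian",
    "Vietnamese", "Indonesian", "Japanese", "Italian", "Korean", "Thai",
    "Polish", "Romanian", "Greek", "Czech", "Finnish", "Hindi", "Bulgarian",
    "Danish", "Hebrew", "Malay", "Persian", "Slovak", "Swedish", "Croatian",
    "Filipino", "Hungarian", "Norwegian", "Slovenian", "Catalan", "Nynorsk",
    "Tamil", "Afrikaans", "auto",
}


def build_payload(
    text: str,
    voice_setting: dict,
    audio_setting: Optional[dict] = None,
    pronunciation_dict: Optional[dict] = None,
    language_boost: Optional[str] = None,
    voice_modify: Optional[dict] = None,
) -> dict:
    """Assemble the request body, including only the fields that were given."""
    if not text:
        raise ValueError("text is required")
    if len(text) > MAX_TEXT_CHARS:
        raise ValueError(f"text exceeds {MAX_TEXT_CHARS} characters")
    if not isinstance(voice_setting, dict):
        raise ValueError("voice_setting is required and must be an object")
    if language_boost is not None and language_boost not in LANGUAGE_BOOST_VALUES:
        raise ValueError(f"unsupported language_boost: {language_boost!r}")
    if voice_modify is not None:
        # Assumption: the format restriction refers to audio_setting.format.
        fmt = (audio_setting or {}).get("format")
        if fmt is not None and fmt not in VOICE_MODIFY_FORMATS:
            raise ValueError(f"voice_modify does not support format {fmt!r}")

    payload = {"text": text, "voice_setting": voice_setting}
    optional = {
        "audio_setting": audio_setting,
        "pronunciation_dict": pronunciation_dict,
        "language_boost": language_boost,
        "voice_modify": voice_modify,
    }
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload
```

Omitting unset optional fields, rather than sending them as null, keeps the body close to the curl example and leaves defaults to the server.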

Response Parameters

task_id
string
required
The ID of the asynchronous task. Pass this task_id to the Get Task Result API to retrieve the generated result.
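
To show how the request and response fit together, here is a minimal sketch that submits a task with Python's standard library and reads `task_id` from the response. The endpoint URL and header format are taken from this page; the function names, the timeout value, and the error handling are assumptions. The call to the Get Task Result API is not shown because its details are not described here.

```python
import json
import urllib.request

ENDPOINT = "https://api.highwayapi.ai/v3/async/minimax-speech-2.6-hd"


def build_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Prepare the POST request without sending it."""
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def extract_task_id(response_body: str) -> str:
    """Pull task_id out of the JSON response body."""
    data = json.loads(response_body)
    task_id = data.get("task_id") if isinstance(data, dict) else None
    if not isinstance(task_id, str) or not task_id:
        raise ValueError("response does not contain a task_id")
    return task_id


def submit_task(api_key: str, payload: dict, timeout: float = 30.0) -> str:
    """Send the request and return the task_id (performs a network call)."""
    request = build_request(api_key, payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return extract_task_id(response.read().decode("utf-8"))
```

Keeping request construction and response parsing separate from the network call makes both easy to test without contacting the service.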