Fish Audio Text-to-Speech
POST
For best results, we recommend using audio cloning to upload reference audio before using this API. This will improve speech quality and reduce latency.
Supported output formats:
- WAV / PCM
  - Sample rates: 8kHz, 16kHz, 24kHz, 32kHz, 44.1kHz
  - Default sample rate: 44.1kHz
  - 16-bit, mono
- MP3
  - Sample rates: 32kHz, 44.1kHz
  - Default sample rate: 44.1kHz
  - Mono
  - Bitrates: 64kbps, 128kbps (default), 192kbps
- Opus
  - Sample rate: 48kHz
  - Default sample rate: 48kHz
  - Mono
  - Bitrates: -1000 (automatic), 24kbps, 32kbps (default), 48kbps, 64kbps
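The format constraints above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API: the table `SUPPORTED` and the helper `resolve_audio_options` are hypothetical names, and expressing sample rates in Hz (for example, 44100 for 44.1kHz) is an assumption about how the service expects them.

```python
# Supported options, transcribed from the format list above.
# Assumption: sample rates are expressed in Hz and bitrates in kbps.
_PCM_LIKE = {
    "sample_rates": {8000, 16000, 24000, 32000, 44100},
    "default_rate": 44100,
    "bitrates": None,          # WAV / PCM take no bitrate setting
    "default_bitrate": None,
}

SUPPORTED = {
    "wav": _PCM_LIKE,
    "pcm": _PCM_LIKE,
    "mp3": {
        "sample_rates": {32000, 44100},
        "default_rate": 44100,
        "bitrates": {64, 128, 192},
        "default_bitrate": 128,
    },
    "opus": {
        "sample_rates": {48000},
        "default_rate": 48000,
        "bitrates": {-1000, 24, 32, 48, 64},  # -1000 means automatic
        "default_bitrate": 32,
    },
}


def resolve_audio_options(fmt="mp3", sample_rate=None, bitrate=None):
    """Fill in documented defaults and reject unsupported combinations.

    Returns a (format, sample_rate, bitrate) tuple; bitrate is None for
    formats that have no bitrate setting.
    """
    if fmt not in SUPPORTED:
        raise ValueError(f"unsupported format: {fmt!r}")
    spec = SUPPORTED[fmt]

    rate = spec["default_rate"] if sample_rate is None else sample_rate
    if rate not in spec["sample_rates"]:
        raise ValueError(f"{fmt} does not support sample rate {rate}")

    if spec["bitrates"] is None:
        if bitrate is not None:
            raise ValueError(f"{fmt} does not take a bitrate")
        return fmt, rate, None

    kbps = spec["default_bitrate"] if bitrate is None else bitrate
    if kbps not in spec["bitrates"]:
        raise ValueError(f"{fmt} does not support bitrate {kbps}")
    return fmt, rate, kbps
```

Calling `resolve_audio_options("mp3", 8000)` raises `ValueError`, since 8kHz is listed only for WAV / PCM.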
Request Headers
- Content type. Enum value: application/json
- Authentication. Bearer authentication format: Bearer {{API Key}}.
Request Body
- The text to convert to speech.
- Controls the randomness of speech generation. Higher values (for example, 1.0) make the output more random, while lower values (for example, 0.1) make it more deterministic. We recommend using 0.9 for the s1 model. Required range: 0 <= x <= 1
- Controls diversity through nucleus sampling. Lower values (for example, 0.1) make the output more focused, while higher values (for example, 1.0) allow more diversity. We recommend using 0.9 for the s1 model. Required range: 0 <= x <= 1
- Reference audio for the voice. This requires MessagePack serialization and will override reference_voices and reference_texts.
- The reference model ID for the voice.
- Prosody control for the voice.
- The chunk length for the voice. Required range: 100 <= x <= 300
- Whether to normalize the speech. This will reduce latency, but may reduce performance when handling numbers and dates.
- The format for the speech. Allowed values: wav, pcm, mp3, opus
- The sample rate for the speech.
- The MP3 bitrate for the speech. Allowed values: 64, 128, 192
- The Opus bitrate for the speech. Allowed values: -1000, 24, 32, 48, 64
- The latency setting for the speech. balanced reduces latency but may cause performance degradation. Allowed values: normal, balanced
Response Information
The API directly returns an audio stream in the format specified by the format parameter (default: mp3).
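As a rough illustration of the request and response flow, the sketch below builds a POST request with the documented headers and streams the returned audio to a file. Several things here are assumptions, not taken from this page: the endpoint URL is a placeholder, the body field name `text` is a guess (only `format` is named in the description above), and the helper names `build_tts_request` and `save_audio` are hypothetical.

```python
import json
import urllib.request

# Hypothetical placeholder: this page does not show the endpoint URL.
API_URL = "https://api.example.com/tts"

# File extension to use for each documented output format.
EXTENSIONS = {"wav": ".wav", "pcm": ".pcm", "mp3": ".mp3", "opus": ".opus"}


def build_tts_request(api_key, text, fmt="mp3", api_url=API_URL):
    """Build (but do not send) a JSON POST request with Bearer authentication."""
    if fmt not in EXTENSIONS:
        raise ValueError(f"unsupported format: {fmt!r}")
    # Assumption: the text field is called "text"; "format" is named in the docs.
    body = json.dumps({"text": text, "format": fmt}).encode("utf-8")
    return urllib.request.Request(
        api_url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )


def save_audio(request, basename, fmt="mp3", chunk_size=8192):
    """Send the request and write the returned audio stream to disk in chunks."""
    path = basename + EXTENSIONS[fmt]
    with urllib.request.urlopen(request) as response, open(path, "wb") as out:
        while True:
            chunk = response.read(chunk_size)
            if not chunk:
                break
            out.write(chunk)
    return path
```

With a real endpoint and key, `save_audio(build_tts_request(key, "Hello"), "hello")` would write the stream to `hello.mp3`. Reading in chunks avoids holding the whole audio response in memory.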