Rate Limits

Understanding Rate Limits

Rate limits cap the number of API requests that can be made within a given period of time. They exist to:

Prevent API abuse and misuse
Ensure fair resource allocation
Maintain API performance and reliability
Protect service stability

Default Rate Limits

Each account has default rate limits for model calls, measured in RPM (requests per model per minute) and TPM (tokens per model per minute). Rate limits vary by account tier. The table below lists the eligibility criteria for each tier.

Quota Tier	Eligibility: highest total top-up amount in a single month over the last 3 calendar months (USD)
T1	amount < $50
T2	$50 ≤ amount < $500
T3	$500 ≤ amount < $3000
T4	$3000 ≤ amount < $10000
T5	$10000 ≤ amount
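
The eligibility rule in the table can be sketched as a small lookup. This is only an illustration of the thresholds listed above: the function name `quota_tier` and its input format (one top-up total per calendar month) are assumptions made for the example, not part of the API.

```python
from typing import Sequence

# Lower bounds in USD for tiers T5..T2, copied from the eligibility table.
TIER_FLOORS = [(10000, "T5"), (3000, "T4"), (500, "T3"), (50, "T2")]


def quota_tier(monthly_topups_usd: Sequence[float]) -> str:
    """Return the tier for the per-month top-up totals of the last 3 calendar months.

    Hypothetical helper: the tier depends on the highest single-month total.
    """
    highest = max(monthly_topups_usd, default=0.0)
    for floor, tier in TIER_FLOORS:
        if highest >= floor:
            return tier
    return "T1"


# Example: the best month of the last three reached $620, which falls in T3.
print(quota_tier([120.0, 620.0, 40.0]))
```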

Default rate limits for each tier (RPM / TPM):

Avoiding Rate Limits

If the number of your API requests exceeds the rate limit, the API will return:

HTTP status code: 429 (Too Many Requests).
A message in the response body indicating that the rate limit has been exceeded.

To avoid triggering rate limits, you can take the following measures:

Implement request throttling in your application.
Use exponential backoff when retrying.
Monitor your API usage.
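
As one possible way to implement request throttling, the sketch below keeps a sliding window of recent request start times and blocks until a slot is free. The class name `RpmThrottle` and its parameters are hypothetical. It limits requests only (not tokens), it is not thread-safe, and the RPM value you pass should come from your own tier's limit.

```python
import time
from collections import deque
from typing import Callable, Deque


class RpmThrottle:
    """Allow at most `max_requests` request starts within any `window` seconds."""

    def __init__(self, max_requests: int, window: float = 60.0,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        if max_requests < 1:
            raise ValueError("max_requests must be at least 1")
        self.max_requests = max_requests
        self.window = window
        self._clock = clock
        self._sleep = sleep
        self._starts: Deque[float] = deque()

    def acquire(self) -> None:
        """Wait until a request slot is free, then record the request."""
        while True:
            now = self._clock()
            # Drop timestamps that have left the window.
            while self._starts and now - self._starts[0] >= self.window:
                self._starts.popleft()
            if len(self._starts) < self.max_requests:
                self._starts.append(now)
                return
            # Sleep until the oldest recorded request leaves the window.
            self._sleep(self.window - (now - self._starts[0]))
```

Call `acquire()` immediately before each API request. The `clock` and `sleep` parameters exist only so the throttle can be tested without real waiting.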

Handling 429 Errors

If you receive a 429 error, you can try the following:

Try again later: Wait for a period of time before retrying your request.
Optimize requests: Reduce the request frequency.
Increase rate limits: If you need higher rate limits, please contact us.
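
The "try again later" advice is commonly implemented as exponential backoff with jitter. The following is a minimal sketch under stated assumptions: `send` is a hypothetical callable that performs one request and returns a `(status_code, body)` pair, and the default delays and retry count are illustrative values, not limits documented by the API.

```python
import random
import time
from typing import Callable, Tuple


def call_with_backoff(send: Callable[[], Tuple[int, str]],
                      max_retries: int = 5,
                      base_delay: float = 1.0,
                      max_delay: float = 60.0,
                      sleep: Callable[[float], None] = time.sleep) -> Tuple[int, str]:
    """Call `send`, retrying with exponential backoff while it returns HTTP 429.

    Returns the first non-429 response, or the last 429 response once
    `max_retries` retries have been used up.
    """
    attempt = 0
    while True:
        status, body = send()
        if status != 429 or attempt >= max_retries:
            return status, body
        # The delay ceiling doubles on each retry, capped at max_delay.
        ceiling = min(max_delay, base_delay * 2 ** attempt)
        # Random wait in [0, ceiling] so that clients do not retry in lockstep.
        sleep(random.uniform(0, ceiling))
        attempt += 1
```

If the final result is still a 429 after all retries, reduce the request frequency or request a higher limit rather than retrying indefinitely.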
