What does a 429 error from OpenAI mean?

HTTP 429 is "Too Many Requests". OpenAI returns it when you exceed your rate limit, measured in requests per minute (RPM) or tokens per minute (TPM), or when you have run out of quota or credit. The response body names which limit you hit, so always read it before assuming it is just speed.

How do I fix a 429 caused by too many requests per minute?

Slow down and retry with exponential backoff: wait, then double the wait on each subsequent failure, with a little random jitter so many clients do not retry in lockstep. Most official SDKs do this automatically, but if you call the API yourself you have to implement it. Honour the Retry-After header when it is present.

Why do I get a 429 even though I am barely making any requests?

That is usually a quota or billing problem, not a speed problem. A 429 with an "insufficient_quota" error means your account is out of credit or has hit its spending limit, not that you are sending too fast. Check your usage and billing in the OpenAI dashboard, and add credit or raise the limit.

How do I raise my OpenAI rate limit?

Rate limits scale with your usage tier, which rises automatically as your account spends more over time. You cannot set them by hand on the lower tiers, so the levers are: add a payment method, let your tier mature, and in the meantime batch requests and reduce token usage to stay under the cap.

Fix the OpenAI API rate limit (429) error

About the 429 error
Why do I see this error
Fix a speed-related 429
Reduce how much you send
Fix a quota-related 429
Raise your rate limit

About the 429 error

429 Too Many Requests is the OpenAI API's way of refusing a request because you've gone over a limit. Like any other HTTP rate-limit response, it means the request was well-formed but the server won't process it right now.

There are two very different causes hiding behind the same status code, and fixing the wrong one wastes time.

Why do I see this error

Read the JSON body first — it names the real cause:

Rate limit exceeded: you sent requests faster than your account's requests-per-minute (RPM) or tokens-per-minute (TPM) limit allows. This is a speed problem.
insufficient_quota: your account is out of credit or has hit its spending limit. This is a billing problem, and slowing down won't help.

A 429 that appears even though you're barely sending any requests is almost always the second kind.
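The check described above can be sketched in a few lines of Python. This is a minimal illustration, not the SDK's own logic: classify_429 is a hypothetical helper, and it assumes the body is shaped like {"error": {"code": ..., "type": ...}} and that a speed-related 429 carries the code rate_limit_exceeded. Only insufficient_quota comes from the text above, so confirm the other names against a real response before relying on them.

```python
import json


def classify_429(body_text: str) -> str:
    """Return 'quota', 'rate', or 'unknown' for the body of a 429 response."""
    try:
        error = json.loads(body_text).get("error")
    except (ValueError, AttributeError):
        # Not JSON, or JSON that is not an object.
        return "unknown"
    if not isinstance(error, dict):
        return "unknown"

    # Assumption: the cause appears in either the "code" or the "type" field.
    markers = (error.get("code"), error.get("type"))
    if "insufficient_quota" in markers:
        return "quota"  # billing problem: backing off will not help
    if "rate_limit_exceeded" in markers:
        return "rate"   # speed problem: back off and retry
    return "unknown"
```

Branching on the result keeps the two fixes apart: retry only on "rate", and stop to alert a human on "quota".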

Fix a speed-related 429

The correct response to a rate limit is to back off and retry, not to hammer the endpoint. Use exponential backoff with jitter: wait a moment, and double the wait on each retry, with a little randomness so multiple clients don't all retry at the same instant.

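import random
import time

from openai import OpenAI, RateLimitError

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def chat_with_retry(messages, retries=5):
    """Call the chat API, backing off exponentially on rate-limit errors."""
    for attempt in range(retries):
        try:
            return client.chat.completions.create(
                model="gpt-4o-mini", messages=messages
            )
        except RateLimitError:
            # Out of attempts: surface the error to the caller.
            if attempt == retries - 1:
                raise
            # Wait 1s, 2s, 4s, ... plus up to 1s of random jitter.
            wait = (2 ** attempt) + random.random()
            time.sleep(wait)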

The official SDKs already retry with backoff internally, so if you're using one, raise its retry count rather than writing your own loop. If you call the API directly over HTTP, honour the Retry-After header when it's present, because it tells you exactly how long to wait. The same pattern applies in PHP; see "Use the OpenAI API in Laravel".
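For direct HTTP calls, the wait time can be computed from the response headers. The sketch below is one way to do it, under stated assumptions: retry_delay is a hypothetical helper, headers is a plain dict taken from whatever HTTP client you use, and the 60-second cap is an arbitrary choice. It handles only the seconds form of Retry-After and falls back to exponential backoff with jitter for anything else, including the HTTP-date form.

```python
import random


def retry_delay(headers: dict, attempt: int, cap: float = 60.0) -> float:
    """Seconds to wait before retrying; `attempt` is 0 for the first retry."""
    # Header names are case-insensitive, so normalise before looking up.
    lowered = {str(name).lower(): value for name, value in headers.items()}
    retry_after = lowered.get("retry-after")
    if retry_after is not None:
        try:
            # The server's own hint wins and is not capped.
            return max(float(retry_after), 0.0)
        except ValueError:
            pass  # e.g. an HTTP-date value; fall through to backoff

    # Exponential backoff (1s, 2s, 4s, ...) plus up to 1s of jitter, capped.
    return min((2 ** attempt) + random.random(), cap)
```

Call time.sleep(retry_delay(response_headers, attempt)) between attempts. If you use the Python SDK instead, the retry count is set with the max_retries client option, for example OpenAI(max_retries=5).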

Reduce how much you send

Backoff treats the symptom. To stop hitting the limit at all, lower your request and token rate:

Batch multiple items into one request instead of one call each.
Cache repeated or identical prompts so you don't pay to ask twice.
Cap max_tokens so a runaway response can't burn your TPM budget.
Pick a smaller model where the task allows it. Smaller models typically have higher rate limits and cost less per token.
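The first two items can be sketched as follows. Everything here is illustrative: batched and PromptCache are hypothetical names, call_model stands in for whatever function actually sends the request, and the cache lives in memory only. Caching like this suits prompts where a repeated answer is acceptable; it is not a fit for calls that are meant to produce a fresh response every time.

```python
import hashlib
import json
from typing import Callable, Iterator


def batched(items: list, size: int) -> Iterator[list]:
    """Yield consecutive chunks of at most `size` items, one request each."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


class PromptCache:
    """In-memory cache keyed on the model name and the exact messages."""

    def __init__(self, call_model: Callable[[str, list], str]):
        # call_model is a hypothetical function that performs the API call.
        self._call_model = call_model
        self._store = {}

    def ask(self, model: str, messages: list) -> str:
        # Hash a canonical JSON form so identical prompts share one key.
        payload = json.dumps([model, messages], sort_keys=True)
        key = hashlib.sha256(payload.encode("utf-8")).hexdigest()
        if key not in self._store:
            self._store[key] = self._call_model(model, messages)
        return self._store[key]
```

Wrapping the real call in PromptCache means a second identical prompt costs no requests and no tokens, while batched lets you fold, say, 100 short items into 10 requests of 10.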

Fix a quota-related 429

If the body says insufficient_quota, no amount of backoff will help. Open your OpenAI dashboard and:

Check your remaining credit and add a payment method if needed.
Review your monthly spending limit — you may have hit a cap you set yourself.
Confirm you're using the right API key for the right project.

Raise your rate limit

Rate limits scale with your usage tier, which increases automatically as your account spends more over time. You can't set the numbers by hand on the lower tiers, so the only durable fix is to let the tier mature while keeping usage efficient with the batching and caching above. If a fixed, predictable limit matters more to you than scaling, running a model on your own server removes the provider's per-minute cap entirely, leaving throughput limited only by your own hardware.
