What should I do first when an API returns 429?

Read Retry-After first if it is present, because it is the server's explicit instruction about when the next attempt is welcome. If Retry-After is missing, look for RateLimit-Reset or X-RateLimit-Reset and convert that into a delay, then retry with capped exponential backoff plus jitter instead of hammering the endpoint.

Is RateLimit-Reset the same as X-RateLimit-Reset?

Not always. The standard RateLimit-Reset header is commonly a delay in seconds until the quota window refreshes, while many legacy X-RateLimit-Reset headers use Unix epoch seconds. This tool treats X-RateLimit-Reset as epoch-style and standard RateLimit-Reset as a delay so you do not sleep for 1.7 billion seconds by mistake.

Why do I still get limited when RateLimit-Remaining is above zero?

Remaining usually describes one quota window, not every limiter protecting the API. You may also be hitting a concurrency cap, a per-endpoint cap, a token-per-minute cap, or a workspace-level shared quota. Log the limiter dimension and check provider-specific reason headers before assuming the remaining counter is wrong.

Does this tool send my API headers or error payload anywhere?

No. The parser runs entirely in the browser against the text you paste. The pasted header/error body is intentionally not placed in the URL and is not saved to localStorage, because real production responses often contain route names, account ids, or accidental credentials.

API Rate Limit Cheatsheet and 429 Header Interpreter

Search rate-limit headers, retry rules, and provider quirks, or paste 429 headers to decode the next safe retry time.

Runs locally
Category Developer & DevOps
Best for Formatting, validating, shrinking, or inspecting code-adjacent text.

Header / error interpreter

Paste rate-limit headers or a JSON error to inspect retry timing. The reference below remains searchable.

Searchable reference

Headers

Retry-After

The strongest client hint: wait this many seconds, or until this HTTP date, before retrying.

Appears most often with 429 Too Many Requests and sometimes with 503 Service Unavailable. If it is present, a client should prefer it over guessed exponential backoff timing.

Retry-After: 60
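Retry-After can carry either a number of seconds or an HTTP date, so a client needs to handle both forms. The following is a minimal sketch, not this tool's implementation; the function name `parseRetryAfter` and the injectable `nowMs` parameter are illustrative assumptions.

```javascript
// Parse a Retry-After header value into a delay in milliseconds.
// Returns null when the value is missing or cannot be parsed.
function parseRetryAfter(value, nowMs = Date.now()) {
  if (value == null) return null;
  const text = String(value).trim();

  // Form 1: delay in seconds, a non-negative integer such as "60".
  if (/^\d+$/.test(text)) return Number(text) * 1000;

  // Form 2: HTTP date, such as "Wed, 21 Oct 2015 07:28:00 GMT".
  const dateMs = Date.parse(text);
  if (Number.isNaN(dateMs)) return null;

  // A date in the past means the wait is already over, so clamp to zero.
  return Math.max(0, dateMs - nowMs);
}
```

Passing `nowMs` explicitly keeps the function deterministic in tests; production callers can omit it.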

Headers

RateLimit-Limit

Standard quota ceiling for the current window, usually a request count like 100 or a policy item like 100;w=60.

Use it to size queues and progress bars, not to decide retry timing by itself. Pair it with RateLimit-Remaining and RateLimit-Reset.

RateLimit-Limit: 100;w=60

Headers

RateLimit-Remaining

How many requests or cost units are left in the current window.

Zero does not always mean the request failed; it can mean this request consumed the last unit. Check the HTTP status and Retry-After before retrying.

RateLimit-Remaining: 0

Headers

RateLimit-Reset

Standard reset delay, in seconds, until the current quota window refreshes.

Do not confuse it with legacy X-RateLimit-Reset, which many APIs use as a Unix epoch timestamp. The standard field is a delay.

RateLimit-Reset: 60

Headers

RateLimit-Policy

A compact policy description such as 100;w=60 or multiple windows separated by commas.

Useful when the API exposes both short burst windows and long daily caps. Clients should plan against the tightest relevant window.

RateLimit-Policy: 100;w=60, 5000;w=3600
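As a rough sketch, the policy string above can be split into windows and compared. This handles only the `limit;w=seconds` form shown in this reference. The helper names are hypothetical, and "tightest" is interpreted here as the lowest sustained rate (limit divided by window), which is one reasonable reading rather than the only one.

```javascript
// Parse "100;w=60, 5000;w=3600" into [{ limit, windowSeconds }, ...].
function parseRateLimitPolicy(value) {
  return String(value)
    .split(",")
    .map((item) => item.trim())
    .filter((item) => item.length > 0)
    .map((item) => {
      const [limitText, ...params] = item.split(";").map((part) => part.trim());
      const windowParam = params.find((param) => param.startsWith("w="));
      return {
        limit: Number(limitText),
        windowSeconds: windowParam ? Number(windowParam.slice(2)) : null,
      };
    })
    .filter((policy) => Number.isFinite(policy.limit));
}

// Pick the window with the lowest sustained rate (requests per second).
// Windows without a usable w= parameter are ignored.
function tightestPolicy(policies) {
  const timed = policies.filter((policy) => policy.windowSeconds > 0);
  if (timed.length === 0) return null;
  return timed.reduce((best, policy) =>
    policy.limit / policy.windowSeconds < best.limit / best.windowSeconds ? policy : best
  );
}
```

For the example header, the hourly window allows fewer requests per second than the one-minute window, so a long-running job would pace itself against the hourly cap while still respecting the burst limit.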

Headers

X-RateLimit-Reset

Legacy provider header, often a Unix epoch timestamp rather than a seconds delay.

GitHub-style APIs commonly return an epoch value here. Treat 1717171717 as a timestamp, but treat standard RateLimit-Reset: 60 as a delay.

X-RateLimit-Reset: 1717171717
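The difference between the two reset fields can be sketched as a single conversion function. This is an illustration under the assumptions stated in this reference (standard field is a delay, legacy `X-` field is epoch seconds); individual providers may differ, and the name `resetDelayMs` is hypothetical.

```javascript
// Convert a reset header into a delay in milliseconds.
// Assumption: any "x-" prefixed reset header carries Unix epoch seconds,
// and the unprefixed standard field carries a delay in seconds.
function resetDelayMs(headerName, value, nowMs = Date.now()) {
  if (value == null || String(value).trim() === "") return null;
  const seconds = Number(value);
  if (!Number.isFinite(seconds)) return null;

  const isLegacy = headerName.toLowerCase().startsWith("x-");
  const delayMs = isLegacy ? seconds * 1000 - nowMs : seconds * 1000;

  // Never return a negative delay; a reset in the past means "retry now".
  return Math.max(0, delayMs);
}
```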

Client patterns

Exponential backoff with jitter

Retry with growing delays, then randomize each wait so many clients do not retry at the same instant.

If Retry-After exists, obey it first. If it does not, use capped exponential backoff with full jitter and stop after a bounded number of attempts.

delay = random(0, min(cap, base * 2 ** attempt))
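The formula above can be written out as a small runnable sketch. The default values for `baseMs` and `capMs`, the injectable `random` source, and the function names are assumptions chosen for illustration.

```javascript
// Full jitter: pick a uniform random delay between 0 and the capped
// exponential ceiling for this attempt (attempt starts at 0).
function backoffDelayMs(attempt, { baseMs = 500, capMs = 30000, random = Math.random } = {}) {
  const ceilingMs = Math.min(capMs, baseMs * 2 ** attempt);
  return random() * ceilingMs;
}

// Server guidance wins: use the Retry-After delay when one was parsed,
// otherwise fall back to jittered backoff.
function nextDelayMs(attempt, retryAfterMs, options) {
  return retryAfterMs != null ? retryAfterMs : backoffDelayMs(attempt, options);
}

// Bound the number of attempts so a persistent failure surfaces as an error.
function shouldRetry(attempt, maxAttempts = 5) {
  return attempt < maxAttempts;
}
```

Injecting `random` makes the delay deterministic in tests while production code keeps `Math.random`.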

Client patterns

Idempotency key

A client-generated key that makes safe retries possible for payment, order, and mutation endpoints.

Use it for POST requests that create something. Without it, a retry after a timeout can double-create or double-charge.

Idempotency-Key: 5f2c3d1f-8f4b-4a42-9f2b-1dd8d6b7e1aa
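The important detail is that the key is generated once per logical operation and reused on every retry of it. A minimal sketch follows; `idempotentPost` is a hypothetical helper, and the default key generator assumes a runtime that exposes `crypto.randomUUID()` (modern browsers and recent Node.js versions).

```javascript
// Create a request-options factory for one logical create operation.
// The key is generated once here, outside any retry loop, so every
// attempt produced by the returned function carries the same key.
function idempotentPost(body, newKey = () => globalThis.crypto.randomUUID()) {
  const key = newKey();
  return function buildAttempt() {
    return {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        "Idempotency-Key": key,
      },
      body: JSON.stringify(body),
    };
  };
}
```

A retry loop would call the returned function once per attempt and pass the result to its HTTP client; a new key is only created when `idempotentPost` is called again for a different operation.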

Client patterns

Client-side throttle queue

Queue outbound calls locally and release them at a known safe rate instead of waiting for 429s.

This is the best fit for batch jobs and sync workers. Keep concurrency and per-window quotas separate; they fail differently.

maxConcurrent = 4; minIntervalMs = 250
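One way to keep the two limits separate is a queue with two independent gates: a concurrency gate and a minimum-interval gate. The class below is a simplified sketch under that assumption; `ThrottleQueue` and its fields are hypothetical names, and a production queue would also want cancellation and error policies.

```javascript
class ThrottleQueue {
  constructor({ maxConcurrent = 4, minIntervalMs = 250 } = {}) {
    this.maxConcurrent = maxConcurrent;
    this.minIntervalMs = minIntervalMs;
    this.pending = [];            // tasks waiting to start
    this.active = 0;              // tasks currently in flight
    this.lastStartMs = -Infinity; // when the most recent task started
    this.timer = null;            // pending wake-up for the interval gate
  }

  // Enqueue a function returning a promise; resolves with its result.
  add(task) {
    return new Promise((resolve, reject) => {
      this.pending.push({ task, resolve, reject });
      this.drain();
    });
  }

  drain() {
    if (this.timer !== null || this.pending.length === 0) return;
    // Gate 1: concurrency. Wait for an in-flight task to finish.
    if (this.active >= this.maxConcurrent) return;
    // Gate 2: rate. Wait until minIntervalMs has passed since the last start.
    const waitMs = this.lastStartMs + this.minIntervalMs - Date.now();
    if (waitMs > 0) {
      this.timer = setTimeout(() => {
        this.timer = null;
        this.drain();
      }, waitMs);
      return;
    }
    const { task, resolve, reject } = this.pending.shift();
    this.active += 1;
    this.lastStartMs = Date.now();
    Promise.resolve()
      .then(task)
      .then(resolve, reject)
      .finally(() => {
        this.active -= 1;
        this.drain();
      });
    this.drain(); // try to schedule the next task right away
  }
}
```

Usage would look like `queue.add(() => fetch(url))`, with every outbound call in the worker routed through the same queue instance.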

Client patterns

Batching and caching

Reduce request count before tuning retry logic: batch reads, cache GETs, and coalesce duplicate in-flight calls.

The fastest 429 fix is often removing unnecessary calls. Cache immutable metadata and collapse duplicate fetches during page load.

const cached = memoize(fetchPlanLimits, { ttl: 60000 })
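The `memoize` call above is not a built-in; as a sketch of what such a helper could look like, the version below caches by serialized arguments with a time-to-live. Caching the returned promise also coalesces duplicate in-flight calls. The name `memoizeWithTtl` and the injectable `now` clock are assumptions.

```javascript
// Cache results per argument list for ttl milliseconds.
// If fn returns a promise, concurrent callers share the same promise.
function memoizeWithTtl(fn, { ttl = 60000, now = Date.now } = {}) {
  const cache = new Map(); // key -> { value, expiresAt }
  return function memoized(...args) {
    const key = JSON.stringify(args);
    const hit = cache.get(key);
    if (hit && hit.expiresAt > now()) return hit.value;

    const value = fn(...args);
    cache.set(key, { value, expiresAt: now() + ttl });

    // Do not keep a rejected promise around: evict it so the next call retries.
    if (value && typeof value.then === "function") {
      value.then(undefined, () => {
        const entry = cache.get(key);
        if (entry && entry.value === value) cache.delete(key);
      });
    }
    return value;
  };
}
```

Keying on `JSON.stringify(args)` is only suitable for plain, serializable arguments; a real cache might need a custom key function.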

Server patterns

Token bucket

Tokens refill at a steady rate; each request spends tokens. Supports bursts up to bucket capacity.

Good default for public APIs because it allows short bursts while keeping the long-term average stable.

capacity=100, refill=10 tokens/second
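A token bucket with these parameters can be sketched in a few lines. This is an in-memory, single-process illustration; the class and method names are hypothetical, and a real deployment would usually keep the state in a shared store.

```javascript
class TokenBucket {
  constructor({ capacity = 100, refillPerSecond = 10, now = Date.now } = {}) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
    this.now = now;
    this.tokens = capacity; // start full so an initial burst is allowed
    this.lastRefillMs = now();
  }

  // Add the tokens earned since the last call, never exceeding capacity.
  refill() {
    const nowMs = this.now();
    const elapsedSeconds = (nowMs - this.lastRefillMs) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSeconds * this.refillPerSecond);
    this.lastRefillMs = nowMs;
  }

  // Spend `cost` tokens if available; returns whether the request is allowed.
  tryRemove(cost = 1) {
    this.refill();
    if (this.tokens < cost) return false;
    this.tokens -= cost;
    return true;
  }

  // Whole seconds until `cost` tokens will be available, usable as Retry-After.
  retryAfterSeconds(cost = 1) {
    this.refill();
    return Math.max(0, Math.ceil((cost - this.tokens) / this.refillPerSecond));
  }
}
```

The `cost` argument lets the same bucket charge expensive operations more than cheap ones.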

Server patterns

Sliding window

Counts requests over a rolling time range, reducing the boundary spike of fixed windows.

More accurate than a fixed window but usually needs more state. Often implemented with counters plus interpolation.

effective = current + previous * overlapRatio
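The interpolation formula above corresponds to the two-counter approximation of a sliding window. The sketch below is one way to write it, assuming a single in-memory limiter; `SlidingWindowCounter` is a hypothetical name.

```javascript
class SlidingWindowCounter {
  constructor({ limit = 100, windowMs = 60000, now = Date.now } = {}) {
    this.limit = limit;
    this.windowMs = windowMs;
    this.now = now;
    this.windowStart = Math.floor(now() / windowMs) * windowMs;
    this.current = 0;  // requests counted in the current fixed window
    this.previous = 0; // requests counted in the window just before it
  }

  // Advance the fixed windows when time has moved past the current one.
  roll(nowMs) {
    const start = Math.floor(nowMs / this.windowMs) * this.windowMs;
    if (start === this.windowStart) return;
    // Only the immediately preceding window contributes; after a longer
    // idle gap both counters start from zero.
    this.previous = start - this.windowStart === this.windowMs ? this.current : 0;
    this.current = 0;
    this.windowStart = start;
  }

  tryAcquire() {
    const nowMs = this.now();
    this.roll(nowMs);
    // Fraction of the previous window still inside the rolling range.
    const overlapRatio = 1 - (nowMs - this.windowStart) / this.windowMs;
    const effective = this.current + this.previous * overlapRatio;
    if (effective >= this.limit) return false;
    this.current += 1;
    return true;
  }
}
```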

Server patterns

Cost-based limits

Charge expensive API operations more than cheap ones instead of counting every request as 1.

Common for GraphQL, search, AI, and report exports. Expose remaining cost units clearly or clients cannot plan.

RateLimit-Remaining: 870  # cost units

Server patterns

Concurrency limit

Caps simultaneous in-flight requests, which is separate from requests per minute.

A client can be under its hourly quota and still get rejected for opening too many slow requests at once.

429 with message: too many concurrent requests
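A concurrency cap is essentially a counter of in-flight requests that is checked on entry and decremented on exit. The following is a minimal sketch; `ConcurrencyLimiter` and `handleRequest` are hypothetical names, and the response shape is only an illustration of the 429 message shown above.

```javascript
class ConcurrencyLimiter {
  constructor(maxInFlight) {
    this.maxInFlight = maxInFlight;
    this.inFlight = 0;
  }

  // Claim a slot if one is free; returns whether the request may proceed.
  tryAcquire() {
    if (this.inFlight >= this.maxInFlight) return false;
    this.inFlight += 1;
    return true;
  }

  // Give the slot back once the request has finished, success or failure.
  release() {
    this.inFlight = Math.max(0, this.inFlight - 1);
  }
}

// Wrap an async handler: reject immediately when all slots are busy.
async function handleRequest(limiter, work) {
  if (!limiter.tryAcquire()) {
    return { status: 429, body: { message: "too many concurrent requests" } };
  }
  try {
    return { status: 200, body: await work() };
  } finally {
    limiter.release();
  }
}
```

Note that this limiter never looks at a clock: a slow request holds its slot for as long as it runs, which is why a client can hit this limit while its per-minute quota is nearly untouched.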

Provider notes

GitHub style

X-RateLimit-Limit, X-RateLimit-Remaining, and X-RateLimit-Reset, where the reset value is Unix epoch seconds.

A classic legacy header set. The reset value is absolute time, so convert it before showing a countdown.

X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1717171717

Provider notes

Stripe style

Stripe may include Stripe-Rate-Limited-Reason to distinguish global, endpoint, resource, and concurrency limits.

That reason determines the fix: lower global throughput, isolate a hot endpoint, or reduce concurrent mutations on the same object.

Stripe-Rate-Limited-Reason: endpoint-rate

Provider notes

AI API style

AI APIs often limit both requests per minute and tokens per minute; retry timing must respect the tighter one.

A request can fail because token throughput is exhausted even when request count looks fine. Log both dimensions.

x-ratelimit-remaining-tokens: 0
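When several dimensions are limited at once, the safe wait is the longest reset among the dimensions that are actually exhausted. The sketch below assumes the caller has already extracted a remaining count and a reset delay per dimension; the input shape and the function name are hypothetical, and the header names and reset formats vary by provider.

```javascript
// dimensions: [{ name, remaining, resetMs }, ...]
// Returns the delay to wait and which dimensions caused it, for logging.
function retryPlan(dimensions) {
  const exhausted = dimensions.filter((dimension) => dimension.remaining <= 0);
  if (exhausted.length === 0) return { delayMs: 0, limitedBy: [] };
  return {
    // Waiting for the slowest exhausted dimension satisfies all of them.
    delayMs: Math.max(...exhausted.map((dimension) => dimension.resetMs)),
    limitedBy: exhausted.map((dimension) => dimension.name),
  };
}
```

Logging `limitedBy` alongside the delay makes it visible when token throughput, rather than request count, is the real bottleneck.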

Pitfalls

Remaining is not the retry delay

RateLimit-Remaining reports how much capacity is left, not when to retry. Use Retry-After or the reset timing to decide how long to sleep.

Sleeping for the remaining count is a common bug: remaining=0 does not mean sleep 0 seconds.

if (retryAfter) sleep(retryAfter)

Pitfalls

Local clock drift

Epoch reset timestamps depend on your local clock when you convert them to countdowns.

Prefer Retry-After when available. If only epoch reset exists, clamp negative delays to zero and log the raw timestamp.

delay = Math.max(0, resetEpochMs - Date.now())
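One way to reduce the effect of local clock drift is to measure the countdown against the server's own `Date` response header when it is present, and fall back to the local clock otherwise. This is a sketch of that idea rather than something this reference prescribes; `epochResetDelayMs` is a hypothetical name.

```javascript
// Compute the wait until an epoch-seconds reset, preferring the server's
// Date header as the reference time so local clock drift cancels out.
function epochResetDelayMs(resetEpochSeconds, dateHeader, localNowMs = Date.now()) {
  const serverNowMs = dateHeader ? Date.parse(dateHeader) : NaN;
  const referenceMs = Number.isNaN(serverNowMs) ? localNowMs : serverNowMs;
  // Clamp negative delays to zero, as recommended above.
  return Math.max(0, resetEpochSeconds * 1000 - referenceMs);
}
```

The `Date` header only has one-second resolution and ignores network latency, so the result is an estimate; logging the raw reset value alongside it remains useful.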

Pitfalls

Shared quota key

The key used for limiting may be user, token, app, IP, workspace, or endpoint; guessing wrong hides the real hot spot.

Always log the limiter dimension alongside the route. A single user and a whole app need different fixes.

limit_key = workspace_id + ":" + route_group

What this tool does

A dense browser-only API rate limit cheatsheet for developers debugging 429 responses and quota exhaustion. Search the reference for Retry-After, RateLimit-Limit, RateLimit-Remaining, RateLimit-Reset, RateLimit-Policy, X-RateLimit legacy headers, token bucket, sliding windows, jittered backoff, client-side queues, concurrency caps, idempotency keys, GitHub-style epoch resets, Stripe rate-limited reasons, and AI API token-per-minute limits. Paste raw HTTP response headers or a JSON error body and the interpreter extracts retry timing, remaining quota, reset delay, policy text, source fields, and warnings such as malformed JSON, 429 without retry timing, or impossible remaining-over-limit values. Nothing is uploaded, huge pasted responses are rejected before parsing, and the copied report gives you a clean incident note for a ticket, runbook, or Slack thread.
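To give a feel for the kind of checks described above, here is a small sketch of parsing a pasted header block and flagging two of the listed warnings. It is not the tool's actual implementation; the function names and warning strings are hypothetical.

```javascript
// Turn a pasted block of "Name: value" lines into a lowercase-keyed Map.
// Lines without a colon (such as the HTTP status line) are skipped.
function parseHeaderBlock(text) {
  const headers = new Map();
  for (const line of String(text).split(/\r?\n/)) {
    const colon = line.indexOf(":");
    if (colon <= 0) continue;
    headers.set(line.slice(0, colon).trim().toLowerCase(), line.slice(colon + 1).trim());
  }
  return headers;
}

// Flag a 429 that carries no retry timing, and a remaining value that
// is larger than the advertised limit.
function findWarnings(status, headers) {
  const warnings = [];
  const timingFields = ["retry-after", "ratelimit-reset", "x-ratelimit-reset"];
  if (status === 429 && !timingFields.some((name) => headers.has(name))) {
    warnings.push("429 without retry timing");
  }
  // parseInt reads the leading number, so "100;w=60" yields 100.
  const limit = parseInt(headers.get("ratelimit-limit") || "", 10);
  const remaining = parseInt(headers.get("ratelimit-remaining") || "", 10);
  if (Number.isFinite(limit) && Number.isFinite(remaining) && remaining > limit) {
    warnings.push("remaining exceeds limit");
  }
  return warnings;
}
```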

Tool details

Input: Text + Numbers; The page exposes text boxes, numeric controls, file pickers, or structured inputs depending on the tool.
Output: Live result + Copy; The result area focuses on usable output, with copy, download, or preview actions when supported.
Privacy: Browser-side processing; The main tool logic does not call an external API, so inputs normally stay in the current tab.
Save / share: Shareable URL state; Key settings are encoded in the URL so another person can reopen the same setup.
Performance budget: Initial JS <= 18 KB; No WASM budget is declared, keeping the tool quick to open on mobile.
Best fit: Developer & DevOps · Developer; Category and role tags drive related tools, internal links, and quick fit checks.

How to use

1. Input

Paste or drop your content into the tool panel.

2. Process

Click the button. All processing is local in your browser.

3. Copy / Download

Copy the result or download to disk in one click.

How API Rate Limit Cheatsheet fits into your work

Use it in the small gaps between coding, reviewing, debugging, and shipping.

Developer jobs

Formatting, validating, shrinking, or inspecting code-adjacent text.
Preparing snippets for documentation, tickets, commits, or handoff.
Checking a small payload quickly without switching tools.

Developer checks

Run irreversible transforms like minify or obfuscate on a copy.
Keep secrets out of pasted snippets unless the tool explicitly stays local.
Use your normal tests or linter before shipping transformed code.

Real-world use cases

Decode a production 429 response
Paste the response headers from a failed job, read the Retry-After value as seconds or an HTTP date, and copy a compact incident note that includes remaining quota, reset delay, and the fields that were actually present. This removes guesswork from "when can we retry?" conversations during an outage.
Choose a safe retry policy for a client
Search "jitter" or "idempotency" while implementing a sync worker. The reference keeps Retry-After, capped exponential backoff, local queues, batching, and idempotency keys in one place, so a client can retry politely without double-creating records.
Compare standard and legacy headers
When an API returns both RateLimit-* and X-RateLimit-* fields, the interpreter helps separate standard reset delays from legacy epoch timestamps. That is the common mistake that turns a one-minute wait into an absurd sleep or an immediate retry storm.

Common pitfalls

Treating RateLimit-Remaining as a sleep duration. Remaining is capacity, not time; retry timing comes from Retry-After or reset.
Assuming every reset header is seconds. Standard RateLimit-Reset is a delay, but many X-RateLimit-Reset values are Unix epoch seconds.
Retrying non-idempotent POST requests after a timeout without an idempotency key, which can double-charge or double-create.

Privacy

The pasted response is parsed in the browser only. The tool stores the safe search query and category in the URL for sharing, but does not put pasted headers or JSON errors in the URL and does not save them locally.
