
API Rate Limit Cheatsheet and 429 Header Interpreter

Search rate-limit headers, retry rules, and provider quirks, or paste 429 headers to decode the next safe retry.

  • Runs locally
  • Category Developer & DevOps
  • Best for Formatting, validating, shrinking, or inspecting code-adjacent text.

Header / error interpreter

Paste rate-limit headers or a JSON error to inspect retry timing. The reference below remains searchable.

Searchable reference

Headers

Retry-After

The strongest client hint: wait this many seconds, or until this HTTP date, before retrying.

Appears most often with 429 Too Many Requests and sometimes with 503 Service Unavailable. If it is present, a client should prefer it over guessed exponential backoff timing.

Retry-After: 60
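
As a minimal sketch of the two forms described above (a delay in seconds or an HTTP date), the helper below converts a Retry-After value into milliseconds to wait. The name retryAfterToMs and the injectable nowMs parameter are assumptions for illustration, not part of any library.

```javascript
// Hypothetical helper: turn a Retry-After header value into a wait in milliseconds.
// Returns null when the value is missing or unparseable, so the caller can fall
// back to its own backoff policy.
function retryAfterToMs(value, nowMs = Date.now()) {
  if (value == null) return null;
  const text = String(value).trim();
  // Form 1: whole seconds, e.g. "60".
  if (/^\d+$/.test(text)) return Number(text) * 1000;
  // Form 2: HTTP date, e.g. "Wed, 21 Oct 2015 07:28:00 GMT".
  const dateMs = Date.parse(text);
  if (Number.isNaN(dateMs)) return null;
  // A date in the past means the wait is already over.
  return Math.max(0, dateMs - nowMs);
}
```

Passing nowMs explicitly keeps the function easy to test; in production code the default Date.now() is usually enough.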
Headers

RateLimit-Limit

Standard quota ceiling for the current window, usually a request count like 100 or a policy item like 100;w=60.

Use it to size queues and progress bars, not to decide retry timing by itself. Pair it with RateLimit-Remaining and RateLimit-Reset.

RateLimit-Limit: 100;w=60
Headers

RateLimit-Remaining

How many requests or cost units are left in the current window.

Zero does not always mean the request failed; it can mean this request consumed the last unit. Check the HTTP status and Retry-After before retrying.

RateLimit-Remaining: 0
Headers

RateLimit-Reset

Standard reset delay, in seconds, until the current quota window refreshes.

Do not confuse it with legacy X-RateLimit-Reset, which many APIs use as a Unix epoch timestamp. The standard field is a delay.

RateLimit-Reset: 60
Headers

RateLimit-Policy

A compact policy description such as 100;w=60 or multiple windows separated by commas.

Useful when the API exposes both short burst windows and long daily caps. Clients should plan against the tightest relevant window.

RateLimit-Policy: 100;w=60, 5000;w=3600
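
The sketch below parses a policy value in the limit;w=seconds form shown above and picks the window with the lowest sustained request rate. It assumes that simple form only; the function names parseRateLimitPolicy and lowestSustainedRate are hypothetical, and an API that names its policies or adds other parameters would need a fuller parser.

```javascript
// Hypothetical parser for values like "100;w=60, 5000;w=3600".
function parseRateLimitPolicy(value) {
  return value
    .split(",")
    .map((item) => item.trim())
    .filter((item) => item.length > 0)
    .map((item) => {
      const [limitText, ...params] = item.split(";").map((part) => part.trim());
      const windowParam = params.find((p) => p.startsWith("w="));
      return {
        limit: Number(limitText),
        windowSeconds: windowParam ? Number(windowParam.slice(2)) : null,
      };
    });
}

// Returns the policy item that allows the fewest requests per second over time,
// or null when no item carries a window.
function lowestSustainedRate(policies) {
  const withWindow = policies.filter((p) => p.windowSeconds > 0);
  if (withWindow.length === 0) return null;
  return withWindow.reduce((a, b) =>
    a.limit / a.windowSeconds <= b.limit / b.windowSeconds ? a : b
  );
}
```

For the example value, the hourly window (5000 per 3600 seconds) allows a lower average rate than the one-minute window, so a long-running job would plan against it while still respecting the 100-request burst cap.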
Headers

X-RateLimit-Reset

Legacy provider header, often a Unix epoch timestamp rather than a seconds delay.

GitHub-style APIs commonly return an epoch value here. Treat 1717171717 as a timestamp, but treat standard RateLimit-Reset: 60 as a delay.

X-RateLimit-Reset: 1717171717
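
To illustrate the difference between the two reset fields, the sketch below converts either one into a delay in milliseconds. It assumes the legacy header carries Unix epoch seconds, as in the GitHub-style example; some providers use other formats, so resetToDelayMs is a hypothetical starting point rather than a universal rule.

```javascript
// Hypothetical helper: interpret a reset header by its name.
// Returns the wait in milliseconds, or null if the value cannot be interpreted.
function resetToDelayMs(headerName, value, nowMs = Date.now()) {
  const text = String(value).trim();
  if (!/^\d+$/.test(text)) return null;
  const n = Number(text);
  const name = headerName.toLowerCase();
  // RateLimit-Reset: a delay in seconds.
  if (name === "ratelimit-reset") return n * 1000;
  // X-RateLimit-Reset: assumed to be an absolute Unix timestamp in seconds.
  if (name === "x-ratelimit-reset") return Math.max(0, n * 1000 - nowMs);
  return null;
}
```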
Client patterns

Exponential backoff with jitter

Retry with growing delays, then randomize each wait so many clients do not retry at the same instant.

If Retry-After exists, obey it first. If it does not, use capped exponential backoff with full jitter and stop after a bounded number of attempts.

delay = random(0, min(cap, base * 2 ** attempt))
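
The formula above can be sketched as follows. The default baseMs and capMs values and the function names are assumptions chosen for illustration; the random source is injectable so the behavior can be tested.

```javascript
// Full jitter: pick a uniformly random delay between 0 and the capped exponential ceiling.
function backoffDelayMs(attempt, { baseMs = 500, capMs = 30000, random = Math.random } = {}) {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.floor(random() * ceiling);
}

// Prefer the server's hint when it exists; otherwise fall back to jittered backoff.
function nextDelayMs(attempt, retryAfterMs, options) {
  return retryAfterMs != null ? retryAfterMs : backoffDelayMs(attempt, options);
}

// Bound the number of attempts so a persistent failure does not retry forever.
function shouldRetry(attempt, maxAttempts = 5) {
  return attempt < maxAttempts;
}
```

Here attempt starts at 0 for the first retry, so with the assumed defaults the first wait is somewhere below 500 ms and later waits never exceed 30 seconds.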
Client patterns

Idempotency key

A client-generated key that makes safe retries possible for payment, order, and mutation endpoints.

Use it for POST requests that create something. Without it, a retry after a timeout can double-create or double-charge.

Idempotency-Key: 5f2c3d1f-8f4b-4a42-9f2b-1dd8d6b7e1aa
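
A minimal sketch of the pattern: generate the key once per logical operation and reuse it on every retry of that operation. The helper name prepareMutation is hypothetical, and the example assumes the target API honors an Idempotency-Key request header and that the runtime provides crypto.randomUUID.

```javascript
// Hypothetical helper: returns a function that builds identical request options
// for every attempt of one logical operation.
function prepareMutation(body, makeKey = () => globalThis.crypto.randomUUID()) {
  const key = makeKey(); // created once per operation, not once per attempt
  return function buildAttempt() {
    return {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        "Idempotency-Key": key,
      },
      body: JSON.stringify(body),
    };
  };
}
```

The returned object has the shape of a fetch options argument, so each retry can call buildAttempt() again and send the same key. Generating a fresh key inside the retry loop would defeat the purpose, because the server would see each attempt as a new operation.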
Client patterns

Client-side throttle queue

Queue outbound calls locally and release them at a known safe rate instead of waiting for 429s.

This is the best fit for batch jobs and sync workers. Keep concurrency and per-window quotas separate; they fail differently.

maxConcurrent = 4; minIntervalMs = 250
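
The sketch below covers only the minimum-interval half of the settings above: each caller reserves the next free slot and learns how long to wait before sending. The name createIntervalGate is hypothetical, and a concurrency cap such as maxConcurrent would need a separate counter of in-flight requests.

```javascript
// Hypothetical interval gate: spaces out calls by at least minIntervalMs.
function createIntervalGate(minIntervalMs, now = Date.now) {
  let nextSlotMs = 0;
  // Returns how many milliseconds the caller should wait before sending.
  return function reserve() {
    const nowMs = now();
    const startMs = Math.max(nowMs, nextSlotMs);
    nextSlotMs = startMs + minIntervalMs;
    return startMs - nowMs;
  };
}
```

A worker would call reserve(), sleep for the returned number of milliseconds, and then send its request. With minIntervalMs = 250 this caps the release rate at four requests per second.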
Client patterns

Batching and caching

Reduce request count before tuning retry logic: batch reads, cache GETs, and coalesce duplicate in-flight calls.

The fastest 429 fix is often removing unnecessary calls. Cache immutable metadata and collapse duplicate fetches during page load.

const cached = memoize(fetchPlanLimits, { ttl: 60000 })
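
Coalescing duplicate in-flight calls can be sketched as a map from request key to pending promise. The name createCoalescer is hypothetical; unlike the memoize call above, this keeps no result after the request settles, so it complements a cache rather than replacing one.

```javascript
// Hypothetical coalescer: concurrent callers with the same key share one request.
function createCoalescer() {
  const inFlight = new Map();
  return function coalesce(key, start) {
    const existing = inFlight.get(key);
    if (existing) return existing;
    // Start the work now and forget the entry once it settles.
    const promise = new Promise((resolve) => resolve(start())).finally(() => {
      inFlight.delete(key);
    });
    inFlight.set(key, promise);
    return promise;
  };
}
```

Two components that request the same resource during page load would both call coalesce with the same key and receive the same promise, so only one network call is made.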
Server patterns

Token bucket

Tokens refill at a steady rate; each request spends tokens. Supports bursts up to bucket capacity.

Good default for public APIs because it allows short bursts while keeping the long-term average stable.

capacity=100, refill=10 tokens/second
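
A minimal in-memory sketch of a token bucket with the parameters above. The class and method names are assumptions for illustration; a production limiter would usually keep this state in a shared store and handle concurrent updates.

```javascript
class TokenBucket {
  constructor({ capacity, refillPerSecond, now = Date.now }) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
    this.now = now;
    this.tokens = capacity; // start full so an initial burst is allowed
    this.lastRefillMs = now();
  }

  // Add tokens for the time elapsed since the last refill, up to capacity.
  refill() {
    const nowMs = this.now();
    const elapsedSeconds = (nowMs - this.lastRefillMs) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSeconds * this.refillPerSecond);
    this.lastRefillMs = nowMs;
  }

  // Spend `cost` tokens if available; return whether the request is allowed.
  tryRemove(cost = 1) {
    this.refill();
    if (this.tokens < cost) return false;
    this.tokens -= cost;
    return true;
  }
}
```

With capacity 100 and a refill of 10 tokens per second, a client can burst 100 requests at once and then sustain 10 requests per second.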
Server patterns

Sliding window

Counts requests over a rolling time range, reducing the boundary spike of fixed windows.

More accurate than a fixed window but usually needs more state. Often implemented with counters plus interpolation.

effective = current + previous * overlapRatio
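
The interpolation above can be sketched with two fixed-window counters, where overlapRatio is the fraction of the previous window that still falls inside the rolling range. The function and field names are hypothetical.

```javascript
// Estimate the request count over the last windowMs using two fixed-window counters.
function slidingWindowEstimate({ currentCount, previousCount, windowMs, elapsedInCurrentMs }) {
  // Share of the previous window still covered by the rolling range.
  const overlapRatio = Math.max(0, (windowMs - elapsedInCurrentMs) / windowMs);
  return currentCount + previousCount * overlapRatio;
}

// Allow the request only while the estimate stays below the limit.
function allowRequest(state, limit) {
  return slidingWindowEstimate(state) < limit;
}
```

For example, 15 seconds into a 60-second window with 30 requests so far and 80 in the previous window, the estimate is 30 + 80 * 0.75 = 90.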
Server patterns

Cost-based limits

Charge expensive API operations more than cheap ones instead of counting every request as 1.

Common for GraphQL, search, AI, and report exports. Expose remaining cost units clearly or clients cannot plan.

RateLimit-Remaining: 870  # cost units
Server patterns

Concurrency limit

Caps simultaneous in-flight requests, which is separate from requests per minute.

A client can be under its hourly quota and still get rejected for opening too many slow requests at once.

429 with message: too many concurrent requests
Provider notes

GitHub style

X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset where reset is Unix epoch seconds.

A classic legacy header set. The reset value is absolute time, so convert it before showing a countdown.

X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1717171717
Provider notes

Stripe style

Stripe may include Stripe-Rate-Limited-Reason to distinguish global, endpoint, resource, and concurrency limits.

That reason determines the fix: lower global throughput, isolate a hot endpoint, or reduce concurrent mutations on the same object.

Stripe-Rate-Limited-Reason: endpoint-rate
Provider notes

AI API style

AI APIs often limit both requests per minute and tokens per minute; retry timing must respect the tighter one.

A request can fail because token throughput is exhausted even when request count looks fine. Log both dimensions.

x-ratelimit-remaining-tokens: 0
Pitfalls

Remaining is not the retry delay

RateLimit-Remaining reports how much capacity is left, not when to retry. Take the sleep duration from Retry-After or from the reset timing.

Sleeping for the remaining count is a common bug: remaining=0 does not mean sleep 0 seconds.

if (retryAfterSeconds != null) await sleep(retryAfterSeconds * 1000)
Pitfalls

Local clock drift

Epoch reset timestamps depend on your local clock when you convert them to countdowns.

Prefer Retry-After when available. If only epoch reset exists, clamp negative delays to zero and log the raw timestamp.

delayMs = Math.max(0, resetEpochSeconds * 1000 - Date.now())
Pitfalls

Shared quota key

The key used for limiting may be user, token, app, IP, workspace, or endpoint; guessing wrong hides the real hot spot.

Always log the limiter dimension alongside the route. A single user and a whole app need different fixes.

limit_key = workspace_id + ":" + route_group

What this tool does

A dense browser-only API rate limit cheatsheet for developers debugging 429 responses and quota exhaustion. Search the reference for Retry-After, RateLimit-Limit, RateLimit-Remaining, RateLimit-Reset, RateLimit-Policy, X-RateLimit legacy headers, token bucket, sliding windows, jittered backoff, client-side queues, concurrency caps, idempotency keys, GitHub style epoch resets, Stripe rate-limited reasons, and AI API token-per-minute limits. Paste raw HTTP response headers or a JSON error body and the interpreter extracts retry timing, remaining quota, reset delay, policy text, source fields, and warnings such as malformed JSON, 429 without retry timing, or impossible remaining-over-limit values. Nothing is uploaded, huge pasted responses are rejected before parsing, and the copied report gives you a clean incident note for a ticket, runbook, or Slack thread.

Tool details

Input
Text + Numbers
The page exposes text boxes, numeric controls, file pickers, or structured inputs depending on the tool.
Output
Live result + Copy
The result area focuses on usable output, with copy, download, or preview actions when supported.
Privacy
Browser-side processing
The main tool logic does not call an external API, so inputs normally stay in the current tab.
Save / share
Shareable URL state
Key settings are encoded in the URL so another person can reopen the same setup.
Performance budget
Initial JS <= 18 KB
No WASM budget is declared, keeping the tool quick to open on mobile.
Best fit
Developer & DevOps · Developer
Category and role tags drive related tools, internal links, and quick fit checks.

How to use

  1. Input

    Paste or drop your content into the tool panel.

  2. Process

    Click the button. All processing is local in your browser.

  3. Copy / Download

    Copy the result or download to disk in one click.

How API Rate Limit Cheatsheet fits into your work

Use it in the small gaps between coding, reviewing, debugging, and shipping.

Developer jobs

  • Formatting, validating, shrinking, or inspecting code-adjacent text.
  • Preparing snippets for documentation, tickets, commits, or handoff.
  • Checking a small payload quickly without switching tools.

Developer checks

  • Run irreversible transforms like minify or obfuscate on a copy.
  • Keep secrets out of pasted snippets unless the tool explicitly stays local.
  • Use your normal tests or linter before shipping transformed code.

Good next steps

These links move the current task into a more complete workflow.

  1. HTTP Status Code Reference: every HTTP status code from 1xx to 5xx with meaning, when to use it, and RFC; search by code or keyword. Browser-only.
  2. HTTP Header Parser: paste raw HTTP headers and get a clean table with per-header meaning plus duplicate and security badges. Browser-only.
  3. curl Cheatsheet: 80+ curl commands for GET, POST, auth, upload, download, SSL, and proxy, with real examples and pitfalls.

Real-world use cases

  • Decode a production 429 response

    Paste the response headers from a failed job, read the Retry-After value as seconds or an HTTP date, and copy a compact incident note that includes remaining quota, reset delay, and the fields that were actually present. This removes guesswork from "when can we retry?" conversations during an outage.

  • Choose a safe retry policy for a client

    Search "jitter" or "idempotency" while implementing a sync worker. The reference keeps Retry-After, capped exponential backoff, local queues, batching, and idempotency keys in one place, so a client can retry politely without double-creating records.

  • Compare standard and legacy headers

    When an API returns both RateLimit-* and X-RateLimit-* fields, the interpreter helps separate standard reset delays from legacy epoch timestamps. That is the common mistake that turns a one-minute wait into an absurd sleep or an immediate retry storm.

Common pitfalls

  • Treating RateLimit-Remaining as a sleep duration. Remaining is capacity, not time; retry timing comes from Retry-After or reset.

  • Assuming every reset header is seconds. Standard RateLimit-Reset is a delay, but many X-RateLimit-Reset values are Unix epoch seconds.

  • Retrying non-idempotent POST requests after a timeout without an idempotency key, which can double-charge or double-create.

Privacy

The pasted response is parsed in the browser only. The tool stores the safe search query and category in the URL for sharing, but does not put pasted headers or JSON errors in the URL and does not save them locally.


Made by Toolora · 100% client-side · Updated 2026-06-13