Free API Rate Limit Calculator

API rate limiting caps how many requests a client can send in a given time window so backends stay healthy and abuse stays out. Enter your traffic profile to size per-user limits, burst capacity, token bucket parameters, and ready-to-paste config for nginx, Express, Cloudflare, and Redis.

Recommended configuration

Per-user rate limit: 30 req per minute
Burst capacity (bucket size): 90 req
Token refill rate: 0.5 tokens / sec
Steady throughput at peak: 500 req/sec
Burst throughput at peak: 1,500 req/sec
HTTP 429 Retry-After: 2 sec
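
The figures above follow from a few lines of arithmetic. The sketch below reproduces them under assumed inputs that the page does not show: 30 requests per user per minute, a 3x burst multiplier, and 1,000 concurrent users (inferred from 500 req/sec at 0.5 req/sec per user). The function and field names are illustrative, not the calculator's own code.

```python
import math
from dataclasses import dataclass


@dataclass
class RateLimitPlan:
    limit_per_window: int   # steady per-user limit per window
    bucket_size: int        # burst capacity in requests
    refill_per_sec: float   # token refill rate
    steady_rps: float       # aggregate steady throughput at peak
    burst_rps: float        # aggregate burst throughput at peak
    retry_after_sec: int    # wait until one token has been refilled


def size_rate_limit(req_per_window, window_sec, burst_multiplier, concurrent_users):
    """Derive token bucket parameters from a traffic profile (assumed inputs)."""
    refill = req_per_window / window_sec
    return RateLimitPlan(
        limit_per_window=req_per_window,
        bucket_size=req_per_window * burst_multiplier,
        refill_per_sec=refill,
        steady_rps=refill * concurrent_users,
        burst_rps=refill * concurrent_users * burst_multiplier,
        retry_after_sec=math.ceil(1 / refill),
    )


# Assumed profile: 30 req/min per user, 3x burst, 1,000 concurrent users
plan = size_rate_limit(30, 60, 3, 1000)
print(plan)
```

Retry-After here is the time needed to refill a single token, which is why a 0.5 tokens/sec refill rate yields 2 seconds.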

Algorithm comparison for your inputs

The four most common rate-limiting algorithms - and when each is the right fit for your traffic shape.

| Algorithm | Best for | Tradeoff |
| --- | --- | --- |
| Token Bucket (selected) | APIs that need to allow short bursts above the steady rate (e.g. dashboards, batch fetches). | Slightly more state per client; tokens accumulate up to bucket size. |
| Leaky Bucket | Smoothing out traffic to a fixed downstream capacity (e.g. SMS, email, billing). | Excess requests are delayed or dropped; no bursts allowed beyond queue size. |
| Fixed Window | Simple per-minute or per-hour quotas where edge bursts are acceptable. | Allows up to 2x the limit at window boundaries; cheap to implement. |
| Sliding Window | Accurate quota enforcement without boundary bursts (e.g. paid API tiers). | Higher memory and compute; needs per-request timestamps or weighted counters. |
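
To make the fixed-window boundary effect concrete, the sketch below compares a fixed-window counter with a sliding-window log on the same traffic: 30 requests just before a minute boundary and 30 just after. Both classes are minimal in-memory illustrations with hypothetical names, not production limiters.

```python
from collections import deque


class FixedWindow:
    """Counts requests per aligned window; the counter resets at each boundary."""

    def __init__(self, limit, window_sec):
        self.limit, self.window = limit, window_sec
        self.window_id, self.count = None, 0

    def allow(self, now):
        wid = int(now // self.window)
        if wid != self.window_id:
            self.window_id, self.count = wid, 0
        if self.count < self.limit:
            self.count += 1
            return True
        return False


class SlidingWindowLog:
    """Keeps one timestamp per accepted request within the trailing window."""

    def __init__(self, limit, window_sec):
        self.limit, self.window = limit, window_sec
        self.stamps = deque()

    def allow(self, now):
        # Drop timestamps that have aged out of the trailing window.
        while self.stamps and self.stamps[0] <= now - self.window:
            self.stamps.popleft()
        if len(self.stamps) < self.limit:
            self.stamps.append(now)
            return True
        return False


def accepted(limiter, times):
    return sum(limiter.allow(t) for t in times)


# 30 requests in the last second of one minute, 30 in the first second of the next
traffic = [59.0 + i * 0.01 for i in range(30)] + [60.0 + i * 0.01 for i in range(30)]
fixed_accepted = accepted(FixedWindow(30, 60), traffic)
sliding_accepted = accepted(SlidingWindowLog(30, 60), traffic)
print(fixed_accepted, sliding_accepted)  # 60 30
```

The fixed window admits all 60 requests within about 1.3 seconds, twice the nominal limit, while the sliding log holds the client to 30 at the cost of storing a timestamp per request.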

Copy-ready config snippets

nginx limit_req

nginx

# nginx.conf - per-IP rate limit
limit_req_zone $binary_remote_addr zone=api_limit:10m rate=30r/m;

server {
    location /api/ {
        # burst=60 lets a client exceed the steady rate by up to 60 requests;
        # nodelay serves those immediately instead of queueing them
        limit_req zone=api_limit burst=60 nodelay;
        limit_req_status 429;

        proxy_pass http://upstream;
    }
}

Express (express-rate-limit)

javascript

// npm install express express-rate-limit
import express from "express";
import rateLimit from "express-rate-limit";

const app = express();

const apiLimiter = rateLimit({
  windowMs: 60_000, // 1-minute window
  limit: 30, // 30 requests per window per IP (v7+; earlier versions call this `max`)
  standardHeaders: "draft-7", // combined RateLimit header (IETF draft 7)
  legacyHeaders: false, // disable X-RateLimit-* headers
  message: { error: "Too many requests", retryAfter: 2 },
});

app.use("/api/", apiLimiter);

Cloudflare Rate Limiting

json

{
  "description": "API rate limit (30 req per minute)",
  "match": {
    "request": { "url": "*example.com/api/*" }
  },
  "threshold": 30,
  "period": 60,
  "action": {
    "mode": "ban",
    "timeout": 2,
    "response": {
      "content_type": "application/json",
      "body": "{\"error\":\"rate_limited\",\"retry_after\":2}"
    }
  }
}

Redis token bucket (Lua)

lua

-- KEYS[1] = bucket key (e.g. "rl:user:42")
-- ARGV[1] = capacity (90)
-- ARGV[2] = refill rate per second (0.500)
-- ARGV[3] = now (unix seconds, with ms fraction)
-- Returns: { allowed (1|0), tokens_remaining, retry_after_seconds }

local capacity   = tonumber(ARGV[1])
local refillRate = tonumber(ARGV[2])
local now        = tonumber(ARGV[3])

local data = redis.call("HMGET", KEYS[1], "tokens", "ts")
local tokens = tonumber(data[1]) or capacity
local last   = tonumber(data[2]) or now

local elapsed = math.max(0, now - last)
tokens = math.min(capacity, tokens + elapsed * refillRate)

local allowed = 0
local retry   = 0
if tokens >= 1 then
  tokens  = tokens - 1
  allowed = 1
else
  retry = math.ceil((1 - tokens) / refillRate)
end

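-- HSET accepts multiple field/value pairs since Redis 4.0; HMSET is deprecated
redis.call("HSET", KEYS[1], "tokens", tokens, "ts", now)
redis.call("EXPIRE", KEYS[1], math.ceil(capacity / refillRate) * 2)

-- Redis truncates Lua numbers to integers in replies, so round down explicitly
return { allowed, math.floor(tokens), retry }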
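
To check the bucket logic without a Redis instance, the same steps can be mirrored in memory. The sketch below is an illustrative Python translation of the script above, with a plain dict standing in for the Redis hash; the function name `take_token` is hypothetical.

```python
import math


def take_token(state, capacity, refill_rate, now):
    """Mirror of the Lua script: returns (allowed, tokens_remaining, retry_after)."""
    tokens = state.get("tokens", capacity)  # a new bucket starts full
    last = state.get("ts", now)

    # Refill for the time elapsed since the last call, capped at capacity.
    elapsed = max(0.0, now - last)
    tokens = min(capacity, tokens + elapsed * refill_rate)

    if tokens >= 1:
        tokens -= 1
        allowed, retry = True, 0
    else:
        # Seconds until one whole token is available again.
        allowed, retry = False, math.ceil((1 - tokens) / refill_rate)

    state["tokens"], state["ts"] = tokens, now
    return allowed, tokens, retry


bucket = {}
# 91 back-to-back requests against a 90-token bucket refilling at 0.5 tokens/sec
results = [take_token(bucket, 90, 0.5, now=0.0) for _ in range(91)]
after_wait = take_token(bucket, 90, 0.5, now=2.0)
```

The first 90 calls drain the bucket, the 91st is rejected with a 2-second retry hint, and a call two seconds later succeeds because exactly one token has been refilled.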

How to size your rate limit

Four steps to go from traffic estimate to a deployable throttling config.

  1. Estimate per-user request volume

    Enter the average number of requests one user makes per minute and select whether your user count is concurrent (active right now) or daily.

  2. Pick a burst multiplier

    Choose 2x, 3x, 5x, or 10x to allow short spikes above the steady rate. 2x to 3x suits transactional APIs; 5x or higher suits dashboards and batch jobs.

  3. Choose a time window and algorithm

    Select per-second, per-minute, per-hour, or per-day for the limit window, then pick token bucket, leaky bucket, fixed window, or sliding window.

  4. Copy the recommended config

    Paste the generated nginx limit_req, Express express-rate-limit, Cloudflare WAF, or Redis token bucket snippet into your service and deploy.

API rate limiting FAQ

Common questions about throttling, token buckets, 429 responses, and choosing the right rate limit for your service.

What is API rate limiting?

API rate limiting is a throttling technique that caps how many requests a client (user, IP, or API key) can send to a service within a given time window. It protects backend resources from overload, prevents abuse and brute-force attacks, and ensures fair access for all clients. Servers typically respond to throttled requests with HTTP 429 Too Many Requests and a Retry-After header.

Token bucket vs leaky bucket - which should I use?

Use the token bucket algorithm when you want to allow short bursts above the steady-state rate, which is the common case for human-driven APIs like dashboards. Use the leaky bucket algorithm when you need to smooth traffic to a fixed downstream capacity (such as a third-party SMS provider) where bursts would cause failures. Token bucket is more common in HTTP APIs; leaky bucket is more common in queueing and traffic-shaping scenarios.
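
For contrast with the token bucket, here is a minimal sketch of a leaky bucket used as a queue: accepted requests are forwarded at a fixed spacing, and arrivals that would overflow the queue are dropped. The class name, rate, and queue size are assumptions chosen for illustration.

```python
class LeakyBucketQueue:
    """Forwards requests at a constant rate; drops arrivals once the queue is full."""

    def __init__(self, rate_per_sec, queue_size):
        self.interval = 1.0 / rate_per_sec
        self.queue_size = queue_size
        self.next_free = 0.0  # earliest time the next request may be forwarded

    def schedule(self, now):
        """Return the forward time for a request arriving at `now`, or None if dropped."""
        start = max(now, self.next_free)
        # Queue position this request would take (0 means forwarded immediately).
        position = (start - now) / self.interval
        if position > self.queue_size:
            return None
        self.next_free = start + self.interval
        return start


# Six requests arrive at once; downstream accepts 2 req/sec, queue holds 3
shaper = LeakyBucketQueue(rate_per_sec=2, queue_size=3)
forward_times = [shaper.schedule(0.0) for _ in range(6)]
print(forward_times)  # [0.0, 0.5, 1.0, 1.5, None, None]
```

A token bucket with spare tokens would pass all of those requests through at once; the leaky bucket spreads them out so the downstream provider never sees more than its fixed rate.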

How do I choose the right rate limit?

Start from your expected average request rate per user, multiply by the number of active users, then add a burst multiplier (typically 2x to 5x) to absorb normal spikes. Verify the resulting throughput is well under your backend's measured capacity, leaving headroom for growth and incident response. Tier limits by client type: anonymous traffic gets the strictest limit, authenticated users get a higher limit, and paying customers get the highest tier.
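
The headroom check described above can be sketched as a small function. The tier sizes, per-user rates, backend capacity, and the 50% steady-state utilization target are all hypothetical numbers chosen for illustration; substitute your own measurements.

```python
def check_headroom(tiers, backend_capacity_rps, target_utilization=0.5):
    """tiers maps name -> (active_users, req_per_min_per_user, burst_multiplier)."""
    steady = sum(users * rpm / 60 for users, rpm, _ in tiers.values())
    burst = sum(users * rpm / 60 * mult for users, rpm, mult in tiers.values())
    return {
        "steady_rps": steady,
        "burst_rps": burst,
        # Steady load should leave room for growth and incident response.
        "steady_ok": steady <= backend_capacity_rps * target_utilization,
        # Even a simultaneous burst from every tier should fit measured capacity.
        "burst_ok": burst <= backend_capacity_rps,
    }


# Hypothetical tiers: strictest for anonymous traffic, highest for paying customers
tiers = {
    "anonymous": (600, 10, 2),
    "authenticated": (300, 30, 3),
    "paid": (100, 120, 5),
}
report = check_headroom(tiers, backend_capacity_rps=2000)
print(report)
```

With these numbers the steady load is 450 req/sec and the worst-case burst is 1,650 req/sec, both within a backend measured at 2,000 req/sec.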

What HTTP status code should rate-limited requests return?

Return HTTP 429 Too Many Requests for clients that exceed their rate limit, and include a Retry-After header indicating how many seconds the client should wait before retrying. Optionally include X-RateLimit-Limit, X-RateLimit-Remaining, and X-RateLimit-Reset headers on every response so well-behaved clients can self-throttle. Avoid returning 503 Service Unavailable for individual rate limits, since 503 implies the entire service is down.
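
A minimal, framework-independent sketch of building those headers follows. It assumes X-RateLimit-Reset carries a Unix timestamp (some APIs send a delta in seconds instead), and the function name `build_response` is hypothetical.

```python
import math


def build_response(allowed, limit, remaining, reset_epoch, now):
    """Return (status, headers) for one request.

    reset_epoch is assumed to be the Unix time at which the client's quota
    is next replenished.
    """
    headers = {
        "X-RateLimit-Limit": str(limit),
        "X-RateLimit-Remaining": str(max(0, remaining)),
        "X-RateLimit-Reset": str(int(reset_epoch)),
    }
    if allowed:
        return 200, headers
    # Retry-After takes whole seconds; round up so clients never retry early.
    headers["Retry-After"] = str(max(1, math.ceil(reset_epoch - now)))
    return 429, headers


ok_status, ok_headers = build_response(True, 30, 29, 1_700_000_060, 1_700_000_000.0)
limited_status, limited_headers = build_response(False, 30, 0, 1_700_000_060, 1_700_000_058.3)
```

Sending the informational headers on successful responses too lets clients slow down before they ever hit a 429.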

How do I handle burst traffic without throttling legitimate users?

Use a token bucket with a bucket size 2x to 5x the steady-state limit for your window so brief spikes draw down stored tokens instead of triggering 429s. Combine this with a higher burst tier for authenticated users and an even higher tier for paid customers. Monitor the ratio of 429 responses by client type: a sustained 429 rate above a few percent for legitimate clients usually means your limit is too low or your burst capacity is too small.
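
The monitoring step can be sketched as a small aggregation over access-log records. The record shape and the 3% threshold are assumptions standing in for "a few percent"; tune both to your own logging and traffic.

```python
from collections import Counter


def throttle_ratio_by_client(records):
    """records: iterable of (client_type, http_status). Returns share of 429s per type."""
    totals, throttled = Counter(), Counter()
    for client_type, status in records:
        totals[client_type] += 1
        if status == 429:
            throttled[client_type] += 1
    return {c: throttled[c] / totals[c] for c in totals}


def flag_tight_limits(ratios, threshold=0.03):
    """Client types whose 429 share exceeds the (assumed) alert threshold."""
    return sorted(c for c, r in ratios.items() if r > threshold)


# Hypothetical sample: paid clients see 5% 429s, anonymous clients see 1%
records = (
    [("paid", 200)] * 95 + [("paid", 429)] * 5
    + [("anonymous", 200)] * 99 + [("anonymous", 429)] * 1
)
ratios = throttle_ratio_by_client(records)
flagged = flag_tight_limits(ratios)
print(ratios, flagged)
```

Here only the paid tier is flagged, which would suggest raising its limit or burst capacity rather than loosening limits for everyone.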
