Designing API Rate Limits: Token Bucket vs Sliding Window
The two approaches
Token bucket: Refill tokens at a fixed rate, up to a fixed capacity. Each request consumes one token and is rejected when none are left. Simple and efficient, but allows bursts up to the bucket size.
Sliding window: Count requests in the last N seconds. More precise rate enforcement, but requires tracking timestamps per user.
Token bucket implementation
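import (
	"math"
	"sync"
	"time"
)

// TokenBucket refills at a fixed rate and allows bursts up to capacity.
type TokenBucket struct {
	mu         sync.Mutex // guards the fields below; Allow is called concurrently
	capacity   int        // maximum number of tokens (burst size)
	tokens     float64    // fractional, so partial refills are not lost
	rate       float64    // tokens added per second
	lastRefill time.Time
}

// Allow consumes one token and reports whether the request may proceed.
func (tb *TokenBucket) Allow() bool {
	tb.mu.Lock()
	defer tb.mu.Unlock()

	now := time.Now()
	elapsed := now.Sub(tb.lastRefill).Seconds()
	// Truncating the refill to an int while resetting lastRefill would
	// discard fractional tokens, so frequent callers would never refill.
	tb.tokens = math.Min(float64(tb.capacity), tb.tokens+elapsed*tb.rate)
	tb.lastRefill = now
	if tb.tokens >= 1 {
		tb.tokens--
		return true
	}
	return false
}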
When each makes sense
- Token bucket for per-endpoint rate limits where bursts are acceptable
- Sliding window for per-user limits where fairness matters
We use token bucket for anonymous API keys and sliding window for paid tiers.
What I learned
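Start with token bucket. It’s O(1) in both time and memory per limiter and easy to reason about, whereas a sliding window that tracks timestamps stores one entry per recent request. Only add sliding window when a user actually complains about unfair rate limiting.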