2024-03-21 07:30:00
# Backend Practice
Login, SMS sending, and report export are the three endpoints most likely to fail under traffic spikes. The problem is not that their business logic is unusually complex. The problem is that all three naturally require rate limiting.
## Common Mistakes
- Applying rate limits only at the gateway and ignoring business-level differences
- Using one threshold for everything and missing the gap between risky and ordinary endpoints
- Returning a generic busy message without recording the real trigger reason
## A Layered Strategy That Is Usually Enough
- Let the gateway absorb abnormal spikes
- Apply a second limit in business logic by user, IP, or tenant
- Write the matched rule clearly into logs
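The second layer above can be sketched as a small fixed-window counter keyed by user, IP, or tenant. This is a minimal illustration, not a production limiter; the names `RateRule` and `checkLimit` are hypothetical, and a real deployment would typically back the counters with a shared store such as Redis rather than in-process memory.

```typescript
// Hypothetical rule shape: one rule per (endpoint, key) combination.
interface RateRule {
  name: string;      // e.g. "login-per-user" -- the rule written into logs
  key: string;       // user ID, IP, or tenant ID
  limit: number;     // max requests per window
  windowMs: number;  // window length in milliseconds
}

// In-process counters; a real system would use a shared store.
const counters = new Map<string, { count: number; windowStart: number }>();

function checkLimit(rule: RateRule, now: number = Date.now()): boolean {
  const id = `${rule.name}:${rule.key}`;
  const entry = counters.get(id);
  if (!entry || now - entry.windowStart >= rule.windowMs) {
    counters.set(id, { count: 1, windowStart: now });
    return true;
  }
  entry.count += 1;
  if (entry.count > rule.limit) {
    // Log the matched rule, not just a generic "busy" message.
    console.warn(`rate-limit hit rule=${rule.name} key=${rule.key} count=${entry.count}`);
    return false;
  }
  return true;
}
```

Keying the counter by rule name plus identity is what lets the gateway limit and the business limit diverge: the same user can be well under the global threshold and still over the per-endpoint one.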
## Estimation Model
With a token bucket model, the number of requests that can be processed in a time window can be estimated as:
$$
\text{allowedRequests} = \text{rate} \times \text{window} + \text{burst}
$$
Here, rate is the steady throughput and burst is the temporary peak allowance.
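The formula corresponds to a bucket that starts full at its burst capacity and refills at the steady rate: over a window, roughly rate × window + burst requests get through. A minimal sketch (class and method names are illustrative, not from any library):

```typescript
// Token bucket: capacity = burst, refilled continuously at `rate` tokens/sec.
class TokenBucket {
  private tokens: number;
  private lastRefill: number;

  constructor(private rate: number, private burst: number, now: number = Date.now()) {
    this.tokens = burst; // start full: the temporary peak allowance
    this.lastRefill = now;
  }

  // Returns true and consumes one token if the request is allowed.
  tryAcquire(now: number = Date.now()): boolean {
    const elapsedSec = (now - this.lastRefill) / 1000;
    // Refill proportionally to elapsed time, capped at burst capacity.
    this.tokens = Math.min(this.burst, this.tokens + elapsedSec * this.rate);
    this.lastRefill = now;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}
```

With `rate = 1` token per second and `burst = 2`, two requests pass immediately and a third passes only after roughly one second of refill, matching the estimate above.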
## Practical Advice
```typescript
// Returns true when the observed count has reached the configured limit.
function shouldThrottle(count: number, limit: number): boolean {
  return count >= limit;
}
```
The sample code is simple. The harder part is building the right observability:
- Whether normal users are blocked by mistake
- Whether the request source can be identified quickly after a hit
- Whether it works together with security rules and verification-code policies
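One way to make all three questions answerable is to emit a structured log entry every time a limit fires, carrying the rule, the blocked identity, and the source in one record. The field names below are illustrative assumptions, not a fixed schema:

```typescript
// Hypothetical structured event for a rate-limit hit.
interface ThrottleEvent {
  rule: string;    // which rule matched, e.g. "sms-per-ip"
  key: string;     // the user ID, IP, or tenant that was blocked
  source: string;  // request source: client app, gateway route, etc.
  count: number;   // observed count when the limit fired
  limit: number;   // configured threshold, to spot misconfiguration
  at: string;      // ISO timestamp, for correlation with alerts
}

function logThrottle(event: ThrottleEvent): string {
  const line = JSON.stringify(event);
  // Ship through the same pipeline as security and verification-code events
  // so the rules can be correlated rather than reviewed in isolation.
  console.warn(line);
  return line;
}
```

If the same entry also feeds an alert on the false-positive rate (blocks per rule divided by total traffic), misfiring thresholds surface quickly instead of as support tickets.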
## Conclusion
Rate limiting is not about rejecting more requests. It is about keeping core capabilities stable when the system is under abnormal pressure. Thresholds can be tuned over time, but instrumentation and alerts should exist from the start.