Rate Limiter

By Saman Kefayatpour

A Rate Limiter is a mechanism used to control the rate of incoming requests to a server or API. This blog post discusses the importance of rate limiting, common strategies, and implementation techniques.

Importance of Rate Limiting

Rate limiting is crucial for several reasons:

  1. Preventing Abuse: It helps protect servers from being overwhelmed by too many requests, which can lead to denial of service (DoS) attacks.
  2. Fair Usage: It ensures that resources are distributed fairly among users, preventing any single user from monopolizing the service.
  3. Cost Management: By limiting the number of requests, it helps manage costs associated with bandwidth and server usage.
  4. Improved Performance: It can enhance the overall performance of the system by reducing latency and ensuring that resources are available for legitimate users.

Common Rate Limiting Strategies

  1. Fixed Window Counter: This strategy limits the number of requests in a fixed time window (e.g., 100 requests per minute). Once the limit is reached, further requests are denied until the next time window begins. However, it can lead to traffic spikes at the edges of the time windows.
  2. Sliding Window Log: This strategy keeps a log of request timestamps and counts only those within the last window, so the window effectively "slides" with each request. This provides more granular control over the rate limiting.
  3. Sliding Window Counter: This method combines the fixed window and sliding window approaches by maintaining counters for multiple smaller time windows within a larger window.
  4. Token Bucket: In this approach, tokens are added to a bucket at a fixed rate. Each request consumes a token. If the bucket is empty, requests are denied until new tokens are added.
  5. Leaky Bucket: This strategy allows requests to be processed at a fixed rate, regardless of the incoming request rate. Excess requests are queued and processed at a steady rate.
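To make the Sliding Window Log concrete, here is a minimal sketch. The class name `SlidingWindowLog` and the injectable clock are assumptions for illustration, not part of any library; the clock parameter simply makes the behavior easy to test deterministically.

```typescript
// A minimal Sliding Window Log sketch. Each allowed request's timestamp is
// kept in a log; a new request is allowed only if fewer than `limit`
// timestamps fall within the last `windowMs` milliseconds.
export class SlidingWindowLog {
  private log: number[] = [];

  constructor(
    private limit: number, // max requests per window
    private windowMs: number, // window length in milliseconds
    private now: () => number = Date.now, // injectable clock for testing
  ) {}

  allow(): boolean {
    const cutoff = this.now() - this.windowMs;
    // Drop timestamps that have slid out of the window.
    this.log = this.log.filter((t) => t > cutoff);
    if (this.log.length < this.limit) {
      this.log.push(this.now());
      return true;
    }
    return false;
  }
}
```

Note how the per-request filtering is what gives this strategy its precision, and also its memory cost: one stored timestamp per recent request.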

Choosing the Right Strategy

The choice of rate limiting strategy depends on the specific requirements of the application, including:

  • Traffic Patterns: Understanding the expected traffic patterns can help in selecting an appropriate strategy.
  • User Experience: Consider how rate limiting will impact the user experience and choose a strategy that minimizes disruption.
  • Scalability: Ensure that the chosen strategy can scale with the growth of the application and its user base.
  • Complexity: Evaluate the complexity of implementation and maintenance for each strategy.

Strategy Comparison

| Strategy | Pros | Cons | Most common real-world use case |
| --- | --- | --- | --- |
| Fixed Window Counter | Simple to implement | Can lead to bursts of traffic at window boundaries | Simple quotas (100/day, 5000/month) |
| Sliding Window Log | More granular control | Higher memory usage | Most fair; precise per-user limiting |
| Sliding Window Counter | Balances control and memory usage | Requires careful weighted calculation | Good balance between Fixed Window and Sliding Window Log |
| Token Bucket | Flexible and allows bursts | More complex to implement | Most popular for APIs (Stripe, AWS, Google, ...) |
| Leaky Bucket | Smooths out traffic spikes | Can introduce latency | Traffic shaping, queues, constant-rate systems |

In simple terms, Token Bucket is the most popular strategy in real-world applications due to its flexibility and ability to handle bursts of traffic effectively. Fixed Window Counter is often used for simple quota systems, while Sliding Window Log provides the most fairness at the cost of higher memory usage. Sliding Window Counter offers a balanced approach between control and resource consumption. Leaky Bucket is ideal for scenarios requiring smooth traffic flow.
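Since Leaky Bucket is the only strategy above not sketched elsewhere in this post, here is a minimal version of the "bucket as meter" variant. The class name `LeakyBucket` and the injectable clock are assumptions for illustration: each request adds one unit of "water", the bucket drains at a constant `leakRate`, and requests that would overflow `capacity` are rejected.

```typescript
// A minimal Leaky Bucket sketch (meter variant): the bucket fills by one
// unit per request and leaks at a constant `leakRate` units per second.
// Requests that would overflow `capacity` are rejected.
export class LeakyBucket {
  private level = 0;
  private lastLeak: number;

  constructor(
    private capacity: number, // max requests the bucket can hold
    private leakRate: number, // requests drained per second
    private now: () => number = Date.now, // injectable clock for testing
  ) {
    this.lastLeak = this.now();
  }

  private leak() {
    const nowMs = this.now();
    const elapsedSeconds = (nowMs - this.lastLeak) / 1000;
    this.level = Math.max(0, this.level - elapsedSeconds * this.leakRate);
    this.lastLeak = nowMs;
  }

  allow(): boolean {
    this.leak();
    if (this.level + 1 <= this.capacity) {
      this.level += 1;
      return true;
    }
    return false;
  }
}
```

A queue-based variant would instead buffer excess requests and process them at the drain rate, which smooths traffic but adds latency, exactly the trade-off noted in the table.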

Implementation Techniques

Rate limiting can be implemented using various techniques, including:

  1. In-Memory Storage: Using data structures like hash maps to track request counts in memory. This is suitable for single-server applications but may not scale well in distributed systems.
  2. Distributed Caches: Utilizing distributed caching systems like Redis or Memcached to store request counts. This approach is more scalable and can handle multiple servers.
  3. API Gateways: Many API gateways provide built-in rate limiting features that can be configured to enforce limits on incoming requests.
  4. Middleware: Implementing rate limiting as middleware in web frameworks (e.g., Express.js, Django) allows for easy integration and customization.
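As a sketch of technique 1 (in-memory storage), here is a per-key Fixed Window Counter backed by a plain Map. The class name `FixedWindowCounter` and the injectable clock are assumptions for illustration; in a distributed setup the Map would be replaced by a shared store such as Redis.

```typescript
// Per-key Fixed Window Counter kept in an in-memory Map. Each key (e.g. an
// IP address) gets a counter that resets when a new window starts.
interface WindowState {
  windowStart: number;
  count: number;
}

export class FixedWindowCounter {
  private windows = new Map<string, WindowState>();

  constructor(
    private limit: number, // max requests per window
    private windowMs: number, // window length in milliseconds
    private now: () => number = Date.now, // injectable clock for testing
  ) {}

  allow(key: string): boolean {
    const nowMs = this.now();
    const state = this.windows.get(key);
    if (!state || nowMs - state.windowStart >= this.windowMs) {
      // First request in a fresh window for this key.
      this.windows.set(key, { windowStart: nowMs, count: 1 });
      return true;
    }
    if (state.count < this.limit) {
      state.count += 1;
      return true;
    }
    return false;
  }
}
```

This works only within a single process: two servers would each keep their own Map and effectively double the limit, which is why distributed caches or API gateways are preferred at scale.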

Implementation Example of Token Bucket in Express.js

Let's implement a Token Bucket rate limiter in an Express.js application using TypeScript. Below is a simple implementation.

Token Bucket Class:

```typescript
export interface TokenBucketOptions {
  capacity: number; // max tokens
  refillRate: number; // tokens per second
}

export class TokenBucket {
  private capacity: number;
  private tokens: number;
  private refillRate: number;
  private lastRefill: number;

  constructor(options: TokenBucketOptions) {
    this.capacity = options.capacity;
    this.tokens = options.capacity;
    this.refillRate = options.refillRate;
    this.lastRefill = Date.now();
  }

  private refill() {
    const now = Date.now();
    const elapsedSeconds = (now - this.lastRefill) / 1000;
    const refillTokens = elapsedSeconds * this.refillRate;
    this.tokens = Math.min(this.capacity, this.tokens + refillTokens);
    this.lastRefill = now;
  }

  consume(tokens: number = 1): boolean {
    this.refill();
    if (this.tokens >= tokens) {
      this.tokens -= tokens;
      return true;
    }
    return false;
  }

  getTokens() {
    this.refill();
    return this.tokens;
  }
}
```

Rate Limiter Class:

```typescript
import { TokenBucket } from "./token-bucket";

// One bucket per client key; in a distributed setup this map could live in Redis.
const buckets = new Map<string, TokenBucket>();

interface RateLimitConfig {
  capacity: number;
  refillRate: number;
}

export class RateLimiter {
  constructor(private config: RateLimitConfig) {}

  private getBucket(key: string): TokenBucket {
    if (!buckets.has(key)) {
      buckets.set(
        key,
        new TokenBucket({
          capacity: this.config.capacity,
          refillRate: this.config.refillRate,
        }),
      );
    }
    return buckets.get(key)!;
  }

  allowRequest(key: string): boolean {
    const bucket = this.getBucket(key);
    return bucket.consume(1);
  }
}
```

Middleware for Express.js:

```typescript
import { Request, Response, NextFunction } from "express";
import { RateLimiter } from "./rate-limiter";

const limiter = new RateLimiter({
  capacity: 10, // max 10 requests
  refillRate: 2, // 2 requests per second
});

export function rateLimiterMiddleware(
  req: Request,
  res: Response,
  next: NextFunction
) {
  const key: string = req.ip ?? ""; // or userId, apiKey, etc.
  if (!limiter.allowRequest(key)) {
    return res.status(429).json({
      message: "Too Many Requests",
    });
  }
  next();
}
```

Example of using the middleware in an Express.js application:

```typescript
import express from "express";
import { rateLimiterMiddleware } from "./middleware";

const app = express();

app.use(rateLimiterMiddleware);

app.get("/", (req, res) => {
  res.send("Hello, rate-limited world!");
});

app.listen(3000, () => {
  console.log("Server running on http://localhost:3000");
});
```

If you want to see the full implementation with Redis, you can check out this sample repository: Source Code

Conclusion

Rate limiting is an essential technique for managing traffic and ensuring the stability and security of web applications. By understanding the different strategies and implementation techniques, developers can choose the most appropriate approach for their specific use case. The Token Bucket strategy is widely used in real-world applications due to its flexibility and effectiveness in handling bursts of traffic. Implementing rate limiting can help protect your application from abuse, ensure fair usage, and improve overall performance.