1. Introduction to Rate Limiting
Start by introducing rate limiting and explaining its purpose:
Purpose: A rate limiter is a tool designed to control the number of requests a user or client can make to a service in a given time frame.
Use Cases:
- Prevent abuse (e.g., DDoS attacks or brute force attempts).
- Ensure fair usage of resources.
- Protect backend services from being overwhelmed by too many requests.
Rate Limiting Algorithms: Mention some of the common algorithms, like:
- Fixed Window: A fixed number of requests within a time window.
- Sliding Window: Similar to the fixed window, but the window slides forward continuously.
- Token Bucket: Allows bursts of traffic, but ensures that requests are regulated over time.
- Leaky Bucket: Similar to token bucket, but requests are processed in a steady flow.
2. Components of the High-Level Design
The high-level design should include the following components, each responsible for a specific part of the rate-limiting system:
A. Request Handling System
API Gateway or Proxy:
- This component sits between the user and the backend services. All incoming requests from users first pass through the API Gateway.
- It checks the rate limit before passing the request to the backend service.
- If the rate limit is exceeded, it returns an HTTP 429 Too Many Requests error to the client (a sketch follows at the end of this subsection).
Rate Limiter Logic:
- This is the core logic that checks if a request can be processed based on the rate limit and the time window. The logic can vary depending on the algorithm (fixed window, sliding window, etc.).
- It interacts with data storage (e.g., databases or in-memory caches like Redis) to track the number of requests and enforce limits.
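To make this concrete, here is a minimal sketch of the gateway-side check in Python. The names handle_request, rate_limiter_allows, and forward_to_backend are hypothetical stand-ins for the components described above, not a real gateway API:

```python
from http import HTTPStatus

WINDOW_SECONDS = 60  # assumed window length, used for the Retry-After hint

def rate_limiter_allows(user_id: str) -> bool:
    """Stand-in for the core rate limiter logic (real versions are sketched below)."""
    return True

def forward_to_backend(user_id: str) -> tuple[int, dict]:
    """Stand-in for proxying the request to the backend service."""
    return HTTPStatus.OK, {}

def handle_request(user_id: str) -> tuple[int, dict]:
    """Gateway entry point: consult the rate limiter before forwarding."""
    if rate_limiter_allows(user_id):
        return forward_to_backend(user_id)
    # Over the limit: reject with 429 and tell the client when to retry.
    return HTTPStatus.TOO_MANY_REQUESTS, {"Retry-After": str(WINDOW_SECONDS)}
```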
B. Storage Layer
In-Memory Data Store (e.g., Redis):
- For high performance and low latency, rate limiting counters (e.g., the number of requests made in a given time window) are often stored in in-memory caches like Redis.
- Redis offers features like TTL (Time-to-Live) to automatically expire data once the time window is over, making it ideal for tracking requests over time.
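As an illustration of the TTL pattern, a fixed-window counter can be kept in Redis with INCR and EXPIRE. This sketch assumes the redis-py client and a Redis instance on localhost; note that the two calls are not atomic together, a gap the Lua approach in section 5 closes:

```python
import redis  # assumes the redis-py client and a reachable Redis instance

r = redis.Redis(host="localhost", port=6379)

def increment_and_check(user_id: str, limit: int = 100, window: int = 60) -> bool:
    """Fixed-window counter: the key expires automatically when the window ends."""
    key = f"rate:{user_id}"
    count = r.incr(key)        # atomic increment; creates the key at 1 if absent
    if count == 1:
        r.expire(key, window)  # TTL lets Redis discard the stale counter for us
    return count <= limit
```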
Persistent Database (optional):
- In some cases, a persistent database (e.g., PostgreSQL or MySQL) might be used to store user information, rate limits, and historical data. However, tracking every request in a database is usually too slow and resource-intensive, so it is typically reserved for more critical data or auditing.
C. Rate Limit Policy Management
Rate Limit Rules Configuration:
- A configuration layer allows administrators to set or update rate limits for different resources or users.
- Policies could be based on user roles, API keys, or even IP addresses. For example, some resources might have stricter limits than others.
- Dynamic Rate Limits: Some systems allow rate limits to be adjusted based on external factors (e.g., load on the system).
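A policy table might look like the following sketch. The shape, roles, endpoints, and numbers are all hypothetical; real systems often load this from a config file or service:

```python
# Limits keyed by user role and endpoint: (max requests, window in seconds).
RATE_LIMIT_POLICIES = {
    "free":    {"/login": (5, 60),  "/search": (100, 60)},
    "premium": {"/login": (10, 60), "/search": (1000, 60)},
}

def lookup_limit(role: str, endpoint: str) -> tuple[int, int]:
    """Resolve the policy for a request, falling back to a conservative default."""
    return RATE_LIMIT_POLICIES.get(role, {}).get(endpoint, (60, 60))
```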
D. Monitoring and Metrics
Logging and Analytics:
- It’s essential to monitor the rate-limiting system to ensure it functions correctly and efficiently. Logs can capture:
- Number of requests processed.
- Number of rate limit violations.
- The effectiveness of different rate-limiting policies.
Alerting:
- Set up alerts for abnormal behaviors, such as spikes in traffic or frequent rate-limit breaches, to quickly react to potential issues like DDoS attacks or abuse.
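A minimal sketch of per-decision logging, using Python's standard logging module (the log format and logger name are assumptions):

```python
import logging

logger = logging.getLogger("rate_limiter")

def record_decision(user_id: str, allowed: bool) -> None:
    """Emit one log line per decision; a metrics pipeline can aggregate these."""
    if allowed:
        logger.info("request allowed user=%s", user_id)
    else:
        # Spikes of these warnings are a natural alerting signal.
        logger.warning("rate limit violation user=%s", user_id)
```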
3. Flow of the Rate Limiter System
You can explain the system flow with a simple step-by-step process. Here’s an example flow of a user making an API request:
User Request:
- A user (or client) makes an HTTP request to an API endpoint (e.g., /login).
API Gateway:
- The request hits the API Gateway or Proxy, which acts as an intermediary.
Check Rate Limit:
- The API Gateway calls the Rate Limiter logic to check if the user is within the allowed rate limit for the requested resource.
- The Rate Limiter looks up the current request count for the user in the relevant time window (e.g., within the last minute).
Decision:
- If the user is within the limit (i.e., the request count is below the allowed limit):
- The request is passed to the backend API for processing.
- If the user has exceeded the limit (i.e., the request count has already reached the allowed limit):
- The API Gateway responds with an HTTP 429 Too Many Requests error, and the client is asked to try again after a specific wait time (e.g., via the Retry-After header).
Store the Request Count:
- If the request is allowed, the Rate Limiter updates the request count in the data store (e.g., Redis).
- If the time window has passed, the system will reset the counter for the user.
Response:
- The backend API’s response (success or error) is sent back to the user.
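Taken together, this flow amounts to the fixed-window strategy described in the next section. Here is a compact, single-node sketch in Python; the class and method names are illustrative, and a real deployment would keep the counters in Redis as described in section 2:

```python
import time

class FixedWindowLimiter:
    """Tracks a request count per user and resets it when the window elapses."""
    def __init__(self, limit: int, window: float):
        self.limit, self.window = limit, window
        self.counts: dict[str, tuple[int, float]] = {}  # user -> (count, window start)

    def allow(self, user_id: str) -> bool:
        now = time.monotonic()
        count, start = self.counts.get(user_id, (0, now))
        if now - start >= self.window:              # window passed: reset the counter
            count, start = 0, now
        if count >= self.limit:                     # over the limit: caller returns 429
            return False
        self.counts[user_id] = (count + 1, start)   # store the updated request count
        return True

limiter = FixedWindowLimiter(limit=100, window=60)
status = 200 if limiter.allow("user-42") else 429
```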
4. Rate Limiting Strategies
It’s important to explain the different strategies for implementing rate limiting. Some of the common ones include:
A. Fixed Window
- Description: The rate limit is applied to a fixed window of time, such as per minute, per hour, or per day.
- Example: A user can make 100 requests per minute. After 100 requests, they need to wait until the start of the next minute to make more requests.
- Pros: Simple to implement.
- Cons: Allows bursts at window edges; a client can send up to twice the limit by clustering requests just before and just after a reset.
B. Sliding Window
- Description: Similar to the fixed window but more dynamic. The rate limit is applied over a sliding window, such as the last 60 seconds.
- Example: If a user can make 100 requests per minute, the system ensures they never exceed 100 requests within any rolling 60-second window, no matter how the requests are spaced.
- Pros: More granular and fairer than fixed window.
- Cons: Slightly more complex to implement.
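One way to implement this is a sliding-window log, which keeps one timestamp per request and evicts those older than the window. This is an in-memory sketch with illustrative names; distributed versions often use a Redis sorted set instead:

```python
import time
from collections import defaultdict, deque

class SlidingWindowLog:
    """Allows a request only if fewer than `limit` requests fall in the last `window` seconds."""
    def __init__(self, limit: int, window: float):
        self.limit, self.window = limit, window
        self.logs: dict[str, deque] = defaultdict(deque)

    def allow(self, user_id: str) -> bool:
        now = time.monotonic()
        log = self.logs[user_id]
        while log and now - log[0] >= self.window:  # evict timestamps outside the window
            log.popleft()
        if len(log) >= self.limit:
            return False                            # limit hit within the rolling window
        log.append(now)
        return True
```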
C. Token Bucket
- Description: Allows bursts of requests by storing “tokens” that refill over time. A user can make a request only if they have a token.
- Example: If a user is allowed 100 requests per minute, the system starts with 100 tokens. Each request consumes a token, and the system refills tokens over time (e.g., one token every 600 milliseconds).
- Pros: Supports bursts while enforcing long-term limits.
- Cons: More complex to implement than fixed or sliding windows.
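A minimal token bucket sketch, refilling lazily on each call rather than with a background timer (names are illustrative):

```python
import time

class TokenBucket:
    """Capacity bounds the burst size; the refill rate enforces the long-term average."""
    def __init__(self, capacity: float, refill_rate: float):
        self.capacity = capacity        # e.g. 100 tokens
        self.refill_rate = refill_rate  # tokens per second, e.g. 100 / 60
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.refill_rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1            # each request consumes one token
            return True
        return False

bucket = TokenBucket(capacity=100, refill_rate=100 / 60)  # one token every 600 ms
```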
D. Leaky Bucket
- Description: Similar to the token bucket, but requests are processed at a steady rate. Excess requests overflow when the bucket is full.
- Example: The user can make 100 requests in a minute, but the system processes requests at a constant rate (e.g., 1 request per second).
- Pros: Ensures a steady flow of traffic.
- Cons: Cannot absorb bursts as well as the token bucket.
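A leaky bucket sketch in the same lazy style, modeled as a meter: the water level drains at the leak rate, and requests that would overflow the bucket are rejected (names are illustrative):

```python
import time

class LeakyBucket:
    """Admits requests while the bucket has room; the level drains at a constant rate."""
    def __init__(self, capacity: float, leak_rate: float):
        self.capacity = capacity    # bucket size, e.g. 100 queued requests
        self.leak_rate = leak_rate  # requests drained per second, e.g. 1.0
        self.level = 0.0
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Drain the bucket for the time that has passed since the last request.
        self.level = max(0.0, self.level - (now - self.last) * self.leak_rate)
        self.last = now
        if self.level + 1 <= self.capacity:
            self.level += 1         # admit the request
            return True
        return False                # bucket full: the request overflows
```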
5. Scalability and High Availability
- Horizontal Scaling: Rate limiting systems should be able to scale horizontally. This means that as traffic grows, additional instances of the rate-limiting system (API gateways, databases, caches) can be added.
- Distributed Rate Limiting: If the system is distributed (e.g., across multiple regions), rate limits should be coordinated across instances to ensure consistency (see the sketch after this list).
- Data Storage (Redis or DB): To support large-scale systems, rate-limiting counters are often stored in distributed in-memory stores like Redis, which can handle high throughput and low latency.
- Fault Tolerance: Ensure the system can handle failures gracefully, including fallback mechanisms in case the rate limiter service or database goes down.
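One common way to keep counts consistent and race-free across gateway instances is to run the check as a single atomic Lua script against a shared Redis. This sketch assumes redis-py; inside the script, INCR and EXPIRE cannot interleave with other clients, which fixes the non-atomicity noted in section 2:

```python
import redis  # assumes redis-py and a Redis instance shared by all gateways

# INCR and EXPIRE execute atomically inside one script, so concurrent
# gateway instances cannot race between the two steps.
FIXED_WINDOW_LUA = """
local count = redis.call('INCR', KEYS[1])
if count == 1 then
  redis.call('EXPIRE', KEYS[1], ARGV[1])
end
return count
"""

r = redis.Redis(host="localhost", port=6379)
check = r.register_script(FIXED_WINDOW_LUA)

def allow(user_id: str, limit: int = 100, window: int = 60) -> bool:
    count = check(keys=[f"rate:{user_id}"], args=[window])
    return int(count) <= limit
```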
6. Summary of High-Level Design
To summarize the high-level design of a rate limiter:
Components:
- API Gateway: Checks requests before passing them to backend services.
- Rate Limiter Logic: Enforces rate limits using algorithms like fixed window, sliding window, token bucket, or leaky bucket.
- Data Storage: Stores counters and request logs (e.g., using Redis or a persistent database).
- Rate Limit Policies: Defines configurable limits per user, resource, or API key.
- Monitoring and Analytics: Tracks system performance and rate limit violations.
Flow:
- User makes a request.
- The API Gateway checks if the request is within the rate limit.
- If allowed, the request is processed. If not, a 429 Too Many Requests response is returned.