1. Requirements of a Rate Limiter

Functional Requirements:

These are the core functionalities that the rate limiter should provide to meet its goals.

 

Request Tracking: The system should be able to track the number of requests made by each user or client in a given time period.
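For illustration, a minimal in-memory sketch of request tracking might keep one timestamp queue per client. All names here (RequestTracker, record, count_since) are assumptions made for the example, not part of any specific library, and a production system would typically keep these counts in a shared store rather than process memory.

```python
import time
from collections import defaultdict, deque

class RequestTracker:
    """Hypothetical in-memory tracker: one timestamp queue per client."""

    def __init__(self) -> None:
        self._requests = defaultdict(deque)  # client_id -> timestamps of requests

    def record(self, client_id: str) -> None:
        """Note that this client has made one more request right now."""
        self._requests[client_id].append(time.monotonic())

    def count_since(self, client_id: str, window_seconds: float) -> int:
        """Return how many requests the client made within the last window."""
        cutoff = time.monotonic() - window_seconds
        q = self._requests[client_id]
        while q and q[0] < cutoff:
            q.popleft()  # discard timestamps that fell out of the window
        return len(q)
```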

 

Enforcement of Limits: The system should ensure that requests exceeding the defined rate limits are rejected or delayed until they are allowed again.
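Continuing the hypothetical RequestTracker sketch above, enforcement is then a check performed before the request is handled: if the client is already at its limit for the current window, the request is rejected (or delayed).

```python
def allow_request(tracker: RequestTracker, client_id: str,
                  limit: int, window_seconds: float) -> bool:
    """Return True if the request may proceed, False if it must be rejected."""
    if tracker.count_since(client_id, window_seconds) >= limit:
        return False               # over the limit: reject (e.g., HTTP 429) or delay
    tracker.record(client_id)      # under the limit: count it and let it through
    return True
```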

 

Configurable Limits: The rate limiter should allow configuring different rate limits based on factors such as the following (a configuration sketch appears after this list):

 

  • User or client IP address
  • API endpoint
  • Request type (e.g., GET, POST, etc.)
  • Time window (e.g., per minute, per hour)
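One way such configuration could be expressed (field names and values here are purely illustrative) is a small rule table keyed by endpoint, HTTP method, and time window:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class RateLimitRule:
    endpoint: str        # e.g. "/api/v1/search"
    method: str          # e.g. "GET", "POST"
    limit: int           # maximum requests allowed per window
    window_seconds: int  # length of the time window

# Illustrative rule set: different limits per endpoint, method, and window.
RULES = [
    RateLimitRule("/api/v1/search",  "GET",  100,  60),    # 100 GETs per minute
    RateLimitRule("/api/v1/orders",  "POST", 10,   60),    # 10 POSTs per minute
    RateLimitRule("/api/v1/reports", "GET",  1000, 3600),  # 1000 GETs per hour
]

def find_rule(endpoint: str, method: str) -> Optional[RateLimitRule]:
    """Return the matching rule, or None if the request is not rate limited."""
    for rule in RULES:
        if rule.endpoint == endpoint and rule.method == method:
            return rule
    return None
```

A real system might load these rules from a configuration file or admin service so that limits can change without redeploying the limiter.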

 

Granular Limiting: The system should allow for fine-grained control over who is being limited (a key-derivation sketch appears after this list). This could be based on:

 

  • User ID
  • IP address
  • API key or token
  • Device or session
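In practice, granularity often comes down to how the counter key is derived. The sketch below assumes a precedence order (API key, then user ID, then session, then IP address); the order itself is a design choice, not a standard.

```python
from typing import Optional

def limiter_key(user_id: Optional[str] = None, api_key: Optional[str] = None,
                ip_address: Optional[str] = None, session_id: Optional[str] = None) -> str:
    """Derive the key under which this request's count is tracked.

    Stronger identifiers (API key, user ID) take precedence over weaker ones
    (session, IP), so authenticated clients are limited individually while
    anonymous traffic falls back to per-IP limiting.
    """
    if api_key:
        return f"key:{api_key}"
    if user_id:
        return f"user:{user_id}"
    if session_id:
        return f"session:{session_id}"
    if ip_address:
        return f"ip:{ip_address}"
    return "anonymous"
```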

 

Graceful Handling of Overages: When a user exceeds the rate limit, the system should handle this gracefully, e.g., by sending back a 429 Too Many Requests HTTP status code, possibly with details about when they can make further requests.
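For example, a throttled request might receive a 429 Too Many Requests status together with a Retry-After header and a short JSON body. The helper below is framework-agnostic and only illustrative:

```python
import json

def too_many_requests_response(retry_after_seconds: int) -> dict:
    """Build an illustrative 429 response: status, headers, and JSON body."""
    body = {
        "error": "rate_limit_exceeded",
        "message": "Too many requests. Please retry later.",
        "retry_after_seconds": retry_after_seconds,
    }
    return {
        "status": 429,  # HTTP 429 Too Many Requests
        "headers": {
            "Retry-After": str(retry_after_seconds),
            "Content-Type": "application/json",
        },
        "body": json.dumps(body),
    }
```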

 

Reset Mechanism: The rate limiter should reset the count of requests after a certain period (e.g., every minute, hour, or day), and provide a predictable timeline for users to know when the limit will be reset.
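With fixed windows aligned to the Unix epoch, the next reset time is simple arithmetic, and the value can be surfaced to clients via the Retry-After header shown earlier. A sketch under that alignment assumption:

```python
import time
from typing import Optional

def seconds_until_reset(window_seconds: int, now: Optional[float] = None) -> int:
    """Seconds until the current fixed window (aligned to the Unix epoch)
    rolls over and the request count resets."""
    now = time.time() if now is None else now
    window_start = int(now // window_seconds) * window_seconds
    return int(window_start + window_seconds - now)

# Example: with 60-second windows, a request arriving 45 s into the window
# sees 15 s until reset.
seconds_until_reset(60, now=105.0)  # -> 15
```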

 

Custom Response: The rate limiter should be able to return custom messages to users when their requests are throttled or blocked, explaining the reason and when they can retry.


 

Non-Functional Requirements:

These describe the system’s quality attributes and constraints: how well it should perform rather than what it should do.

 

Scalability: The rate limiter should be able to handle a high number of requests, especially in a large, distributed system. As traffic increases, the system should scale without significantly impacting performance.
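In a distributed deployment, the counters usually live in a shared store so that every application node sees the same counts. The sketch below assumes a reachable Redis instance and the redis-py client, with a fixed-window key per client; the key-naming scheme is an assumption.

```python
import time

import redis  # assumes the redis-py client and a reachable Redis instance

r = redis.Redis(host="localhost", port=6379)

def allow_request_distributed(client_key: str, limit: int, window_seconds: int) -> bool:
    """Fixed-window counter shared by all application nodes."""
    window = int(time.time() // window_seconds)
    redis_key = f"ratelimit:{client_key}:{window}"   # hypothetical key scheme
    count = r.incr(redis_key)                        # atomic increment in Redis
    if count == 1:
        r.expire(redis_key, window_seconds)          # drop the key when the window ends
    return count <= limit
```

Because incr and expire are issued separately here, real deployments often wrap them in a pipeline or Lua script so the expiry is set atomically with the first increment.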

 

Low Latency: The rate limiting process should introduce minimal overhead so that it does not slow down the overall system. It should be fast enough to ensure that user requests are handled promptly.

 

Reliability: The rate limiter should be highly reliable, meaning it must work as expected even under heavy load or failure conditions. The system should be fault-tolerant to avoid disruptions.

 

Efficiency: The system should manage resources efficiently, using minimal memory and processing power, especially if it needs to track and store request counts for millions of users.

 

Persistence: Depending on the implementation, the rate limiter may need to store request counts (e.g., in memory, a database, or cache). The system should be designed to handle persistent or ephemeral storage efficiently.

 

Fairness: The system should ensure fair usage, meaning no single user or client can monopolize resources by sending too many requests; capacity should be distributed in a balanced way across users.



2. Goals of the Rate Limiter

The goals provide the broader context of what the system should achieve and why these features are important.

 

Primary Goals:

 

Prevent Overload of Resources: The most critical goal of rate limiting is to protect the backend servers or services from being overwhelmed. By enforcing rate limits, we ensure that the system accepts only as many requests as it can process without crashing or becoming unresponsive.

 

Ensure Fair Usage: A key goal is to provide equitable access to resources, preventing any user from consuming too much bandwidth or CPU resources at the expense of others. It helps ensure that all users can interact with the service in a reasonable way.

 

Protect Against Abuse and Malicious Activities: Rate limiting can help mitigate security risks, such as brute-force attacks, denial-of-service (DoS) attacks, and spamming, by limiting how often an attacker can make requests.

 

Provide Predictable API Behavior: With rate limits in place, users and developers can better understand the service’s limits, helping them plan their usage and avoid unexpected disruptions.

 

Improve System Performance: Rate limiting can also optimize performance by controlling the volume of requests. This helps keep the system responsive by preventing it from becoming bogged down with too many concurrent requests.

 

Secondary Goals:

 

Optimize Resource Usage: Efficient use of resources such as memory, CPU, and storage is essential for scalability. The system should minimize its memory footprint while effectively tracking requests and applying limits.

 

Minimize Latency for Users: While the rate limiter must track and manage requests, it should do so in a way that does not introduce significant delays for users. The process of checking and enforcing limits should be fast enough that it doesn’t impact the overall system response time.

 

Maintain High Availability: The rate limiter should be resilient and always available, even during periods of high load. It should not become a single point of failure in the system.

 

Support Multiple Limit Strategies: The rate limiter should be flexible enough to support different limiting strategies, such as fixed window, sliding window, token bucket, or leaky bucket, depending on the use case.
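As one example of these strategies, a token bucket refills at a steady rate and allows short bursts up to the bucket’s capacity. A minimal single-process sketch (class and attribute names are assumptions):

```python
import time

class TokenBucket:
    """Token-bucket limiter: refills at `rate` tokens/second up to `capacity`."""

    def __init__(self, rate: float, capacity: float) -> None:
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity          # start with a full bucket
        self.last_refill = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill in proportion to the time elapsed, capped at the bucket size.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last_refill) * self.rate)
        self.last_refill = now
        if self.tokens >= 1:
            self.tokens -= 1            # spend one token on this request
            return True
        return False                    # bucket empty: throttle
```

A fixed or sliding window is simpler to reason about, while token and leaky buckets smooth out bursts; the right choice depends on the traffic pattern the API needs to tolerate.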



Summary of Requirements & Goals

In summary, the rate limiter must:

 

  1. Track and enforce request limits to prevent system overload.
  2. Be configurable, allowing limits to be set based on various factors.
  3. Be efficient and scalable to handle high traffic while ensuring minimal latency.
  4. Protect against abuse, ensuring fairness and security.
  5. Provide predictable, reliable service while remaining highly available.

 

By achieving these goals, a rate limiter ensures that systems can handle traffic gracefully, protect resources, and deliver a consistent user experience.
