1. Why is Purging & DB Cleanup Important?

  • Storage Management: Rate-limit data, such as request counts and timestamps, can grow rapidly, especially in high-traffic systems. Regular purging keeps memory and storage use bounded.
  • Performance: Old rows and keys make tables and indexes larger, which slows queries. Cleaning them up keeps lookups fast.
  • Cost Efficiency: Storing large volumes of outdated data in databases or caches (like Redis) can incur unnecessary costs. Periodic cleanup reduces storage overhead.
  • Accuracy: Keeping only current data lets the limiter count correctly and avoids miscalculations caused by stale or expired records.


2. When Should You Purge or Clean Up?

  • Time Window Expiry: Data should be purged when its time window expires. For example, if the rate limit is tracked per minute, the data for a given minute should be removed once that minute has passed.
  • Exceeded Rate Limits: If a user exceeds the rate limit, the request count for that user or IP address must stay in place until the window ends so that further requests keep being rejected. It is reset only after the defined period (e.g., a minute or an hour) has passed.
  • Idle Data: When a user or service has sent no requests for some time, its old records can be removed to keep the database lean.
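
The idle-data case can be sketched in a few lines of Python. This is a minimal illustration, not a production design: the `last_seen` mapping, the `purge_idle` helper, and the 300-second threshold are all assumptions made for the example.

```python
import time


def purge_idle(last_seen, idle_after, now=None):
    """Remove clients whose latest request is older than `idle_after` seconds.

    `last_seen` maps a client id to the timestamp of its latest request.
    Returns the ids that were removed.
    """
    now = time.time() if now is None else now
    idle = [cid for cid, ts in last_seen.items() if now - ts > idle_after]
    for cid in idle:
        del last_seen[cid]
    return idle


# Hypothetical data: user:1 was last seen 400 s ago, user:2 only 110 s ago.
records = {"user:1": 1000.0, "user:2": 1290.0}
removed = purge_idle(records, idle_after=300, now=1400.0)
```

Passing `now` explicitly keeps the function easy to test; a scheduled job would normally call it without that argument.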


3. How Does Purging Work?

Purging typically removes old data using one of the following strategies:

 

TTL (Time-to-Live) in Caches:

 

  • Redis and other caching systems support a TTL feature that automatically expires keys after a certain amount of time.
  • For example, if the rate limit is tracked per minute, the TTL for each user’s request count can be set to 1 minute.
  • Once the TTL expires, Redis automatically purges the data associated with the user, so the count is reset without needing manual intervention.
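
To make the TTL idea concrete, here is a small in-memory sketch of counters that expire. It only mimics the behaviour described above and is not how Redis is implemented internally (Redis combines on-access expiry with periodic background sampling). The class name `TTLCounterStore` and the injectable `clock` are assumptions for the example.

```python
import time


class TTLCounterStore:
    """In-memory counters that expire, loosely mimicking key TTLs."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._data = {}  # key -> (count, expires_at)

    def incr(self, key, ttl):
        """Increment `key`; a TTL is started only when the key is created."""
        now = self._clock()
        count, expires_at = self._data.get(key, (0, 0.0))
        if expires_at <= now:  # missing or expired: start a fresh window
            count, expires_at = 0, now + ttl
        count += 1
        self._data[key] = (count, expires_at)
        return count

    def get(self, key):
        """Return the live count, dropping the key if its TTL has passed."""
        entry = self._data.get(key)
        if entry is None:
            return 0
        count, expires_at = entry
        if expires_at <= self._clock():
            del self._data[key]  # lazy purge on access
            return 0
        return count


# A fake clock makes the expiry visible without waiting 60 seconds.
fake_now = [0.0]
store = TTLCounterStore(clock=lambda: fake_now[0])
store.incr("user:1234:requests", ttl=60)
store.incr("user:1234:requests", ttl=60)
count_in_window = store.get("user:1234:requests")
fake_now[0] = 61.0
count_after_expiry = store.get("user:1234:requests")
```

Within the window the count is 2; once the clock passes the expiry time the key is dropped and the count reads as 0, with no separate cleanup job.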

 

Manual Purging Based on Time Windows:

 

  • If the rate-limiting logic uses time-based windows (e.g., 1 minute, 1 hour), expired data can be purged at the end of each window.
  • This might require a batch process or scheduled job that runs at regular intervals (e.g., every minute) to clear out expired entries.
  • For example, if you are storing request counts for each user in a database, you would run a cleanup process at the start of each new time window to clear out data for the previous window.
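
A scheduled cleanup of this kind might look like the sketch below. It assumes counters are kept in a plain dictionary keyed by `(user_id, window_start)`; both that layout and the helper names are hypothetical, chosen only to show the window arithmetic.

```python
def window_start(ts, window):
    """Align a timestamp (in seconds) to the start of its fixed window."""
    return int(ts // window) * window


def purge_expired_windows(counts, now, window):
    """Drop counters that belong to any window before the current one.

    `counts` maps (user_id, window_start) -> request count.
    Returns the number of entries removed.
    """
    current = window_start(now, window)
    stale = [key for key in counts if key[1] < current]
    for key in stale:
        del counts[key]
    return len(stale)


counts = {
    ("1234", 0): 50,   # window covering seconds 0-59
    ("5678", 0): 30,
    ("1234", 60): 7,   # window covering seconds 60-119
}
removed = purge_expired_windows(counts, now=75, window=60)
```

At `now=75` the current window starts at second 60, so the two counters from the earlier window are removed and the live one is kept.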

 

Purge After Rate Limit Reset:

 

  • Once a time window for rate-limiting resets, the counter for each user should be cleared.
  • This ensures that the next time the user makes a request, they start fresh with their request count at 0.


4. Techniques for Database Cleanup

Using Expiration (TTL):

 

  • Redis provides built-in TTL support that automatically removes keys after a specified expiration time.
  • In a Redis-based rate-limiting system, you can set a TTL for each user’s request count so that after the time window expires (e.g., 1 minute), the counter is automatically cleared.

 

Batch Cleanup Jobs:

 

  • If using a relational database (e.g., MySQL, PostgreSQL), rows do not expire on their own, so you typically need scheduled jobs (e.g., cron jobs) to clean up old records.
  • For example, if you store the rate-limiting data in a table with a timestamp, a job could be scheduled to delete entries older than the current time window (e.g., deleting request counts older than 1 minute).

 

Archiving Old Data:

 

  • Instead of purging data completely, you could archive older records to another storage (e.g., a log file or a separate database table) for auditing or reporting purposes.
  • This can be helpful if you need to maintain historical data but don’t want it clogging up the rate-limiting system’s performance.
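
One way to archive before purging is to copy and delete inside a single transaction, so a row is never lost between the two steps. The sketch below uses Python's built-in sqlite3 module purely so that it runs anywhere; the table names `user_requests` and `user_requests_archive`, the column layout, and the helper `archive_and_purge` are assumptions for illustration.

```python
import sqlite3


def archive_and_purge(conn, cutoff):
    """Copy rows older than `cutoff` to the archive table, then delete them."""
    with conn:  # one transaction: both statements commit, or neither does
        conn.execute(
            "INSERT INTO user_requests_archive "
            "SELECT user_id, request_count, timestamp "
            "FROM user_requests WHERE timestamp < ?",
            (cutoff,),
        )
        deleted = conn.execute(
            "DELETE FROM user_requests WHERE timestamp < ?", (cutoff,)
        ).rowcount
    return deleted


conn = sqlite3.connect(":memory:")
columns = "(user_id TEXT, request_count INTEGER, timestamp TEXT)"
conn.execute("CREATE TABLE user_requests " + columns)
conn.execute("CREATE TABLE user_requests_archive " + columns)
conn.executemany(
    "INSERT INTO user_requests VALUES (?, ?, ?)",
    [("1234", 50, "2025-03-31 12:00:00"), ("5678", 30, "2025-03-31 12:05:00")],
)
moved = archive_and_purge(conn, cutoff="2025-03-31 12:01:00")
```

Timestamps are stored as ISO-style text here, which compares correctly as strings; a production schema would more likely use a native timestamp type.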

 

Periodic Cleanup Using Redis Data Structures:

 

  • If using Redis for rate-limiting, Sorted Sets can be used to store request timestamps, with each timestamp as the score. The sorted set allows for efficient querying of entries within a time window.
  • Because Redis expires whole keys rather than individual sorted-set members, you periodically run a command such as ZREMRANGEBYSCORE to remove entries older than the allowed time window.
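
The pruning step can be illustrated without a Redis server. The sketch below keeps a sorted list of timestamps per user as a stand-in for a sorted set, and the comments name the Redis command each step corresponds to. The class `SlidingWindowLog` and its parameters are hypothetical.

```python
import bisect


class SlidingWindowLog:
    """Per-user sorted timestamp lists, standing in for Redis sorted sets."""

    def __init__(self, limit, window):
        self.limit = limit
        self.window = window
        self._log = {}  # user_id -> ascending list of request timestamps

    def allow(self, user_id, now):
        stamps = self._log.setdefault(user_id, [])
        # Analogue of ZREMRANGEBYSCORE key -inf (now - window):
        # drop every timestamp at or before the window's lower edge.
        cutoff = bisect.bisect_right(stamps, now - self.window)
        del stamps[:cutoff]
        if len(stamps) >= self.limit:  # analogue of ZCARD
            return False
        bisect.insort(stamps, now)     # analogue of ZADD
        return True


limiter = SlidingWindowLog(limit=2, window=60)
results = [limiter.allow("1234", t) for t in (0, 10, 20, 61, 71)]
```

The third request is rejected because two requests already sit inside the 60-second window; by second 61 the oldest entry has been pruned, so requests are accepted again.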


5. Example: Purging with Redis

Let’s say you are using Redis to implement rate limiting with a Fixed Window algorithm, where the user can make 100 requests per minute. Each user’s request count is stored with a key such as user:{user_id}:requests.

 

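  • Set TTL: On the first request of a window, Redis stores the user’s request count with a TTL of 60 seconds. After 60 seconds, the counter automatically expires and is removed, ensuring that the user’s rate limit is reset. The command below sets the request count for user 1234 to 50 with a TTL of 60 seconds. Note that SETEX overwrites the value and restarts the TTL, so issuing it on every request would keep extending the window; later requests in the same window would normally use INCR, which leaves the TTL unchanged.

    SETEX user:1234:requests 60 50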
     
  • Reset Count: After 60 seconds, the user:1234:requests key is automatically purged by Redis. This ensures that after the time window expires, the user starts with a fresh count.
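
A common way to express this fixed-window pattern in application code is to increment the counter on every request and set the TTL only when the key is first created. The sketch below follows that approach. The `incr` and `expire` calls match method names in the redis-py client, but `InMemoryRedis` is a hypothetical stand-in written for this example so the code runs without a server, and `allow_request` is an illustrative helper rather than part of any library.

```python
class InMemoryRedis:
    """Tiny stand-in exposing the two client methods used below.

    It does not expire keys by itself; it only records the requested TTL.
    """

    def __init__(self):
        self.values = {}
        self.ttls = {}

    def incr(self, key):
        self.values[key] = self.values.get(key, 0) + 1
        return self.values[key]

    def expire(self, key, seconds):
        self.ttls[key] = seconds
        return True


def allow_request(client, user_id, limit=100, window=60):
    """Fixed-window check: count the request, start the TTL on the first hit."""
    key = f"user:{user_id}:requests"
    count = client.incr(key)
    if count == 1:
        # First request of the window: the key was just created, so set
        # its TTL once. Later requests must not restart the window.
        client.expire(key, window)
    return count <= limit


client = InMemoryRedis()  # with redis-py this would be a real client object
decisions = [allow_request(client, "1234", limit=3) for _ in range(4)]
```

Against a real server, the increment and the expiry are two separate commands, so a crash between them could leave a counter without a TTL. Wrapping both in a transaction or a Lua script is a frequent way to close that gap.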



6. Example: Batch Cleanup with a Relational Database

Suppose you’re storing user request counts in a database and want to clean up data manually at the end of each time window.

 

  • Schema: A table called user_requests that tracks user_id, request_count, and timestamp.

     

     

    user_id   request_count   timestamp
    1234      50              2025-03-31 12:00:00
    5678      30              2025-03-31 12:00:00

     

Cleanup Process:

 

  • A cron job or batch process runs every minute to delete records older than the current time window. The statement below uses MySQL interval syntax; PostgreSQL would write INTERVAL '1 minute' instead.

    DELETE FROM user_requests WHERE timestamp < NOW() - INTERVAL 1 MINUTE;

  • This deletes all records more than 1 minute old, so the table only holds request counts from roughly the last minute.
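
The job itself can be a short script that a scheduler runs every minute. The following sketch uses Python's built-in sqlite3 module so that it is runnable as-is; in practice the same DELETE would be issued through a MySQL or PostgreSQL driver. The helper `purge_old_rows` and the sample rows are assumptions for illustration.

```python
import sqlite3
from datetime import datetime, timedelta


def purge_old_rows(conn, now, window=timedelta(minutes=1)):
    """Delete rows whose timestamp is older than `now - window`."""
    cutoff = (now - window).strftime("%Y-%m-%d %H:%M:%S")
    with conn:  # commit on success, roll back on error
        cur = conn.execute(
            "DELETE FROM user_requests WHERE timestamp < ?", (cutoff,)
        )
    return cur.rowcount


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user_requests "
    "(user_id TEXT, request_count INTEGER, timestamp TEXT)"
)
conn.executemany(
    "INSERT INTO user_requests VALUES (?, ?, ?)",
    [
        ("1234", 50, "2025-03-31 12:00:00"),
        ("5678", 30, "2025-03-31 12:00:00"),
        ("1234", 4, "2025-03-31 12:01:30"),
    ],
)
deleted = purge_old_rows(conn, now=datetime(2025, 3, 31, 12, 2, 0))
```

Computing the cutoff in application code and passing it as a parameter keeps the statement portable across databases. An index on the timestamp column usually helps the delete avoid a full table scan.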


7. Best Practices for Purging & Cleanup

  1. Automate the Process: Use TTL features of caching systems like Redis, or schedule batch jobs for databases to automatically handle purging and cleanup.
  2. Monitor Storage Growth: Continuously monitor how much storage is used by rate-limiting data, especially if using in-memory storage like Redis.
  3. Ensure Consistency: Ensure that data cleanup doesn’t lead to inconsistencies or miscounted requests. If a counter is purged too early, the user can exceed their limit; if a stale counter is never purged, the user might be falsely blocked.
  4. Archiving for Auditing: If your system needs to maintain historical request data for auditing purposes, consider archiving older data rather than purging it completely.