1. What is Caching?
Caching refers to the practice of storing frequently accessed data in a fast-access storage layer (e.g., in-memory storage like Redis or Memcached) to reduce the time it takes to fetch the data. Instead of querying the database for each request, the application retrieves data from the cache, which is much faster.
For Uber/Ola, caching is critical to ensure a smooth experience for both passengers and drivers, as it can significantly reduce latency and improve system performance.
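To make the read path concrete, here is a minimal cache-aside sketch in Python using the redis-py client. The key format and the fetch_user_from_db helper are illustrative assumptions, not Uber/Ola’s actual code, and it assumes a Redis server running on localhost:

```python
import json

import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

def fetch_user_from_db(user_id: str) -> dict:
    # Stand-in for a real database query (hypothetical helper).
    return {"id": user_id, "name": "Asha", "payment_method": "card-on-file"}

def get_user_profile(user_id: str) -> dict:
    """Cache-aside read: try Redis first, fall back to the database on a miss."""
    key = f"user:{user_id}:profile"
    cached = r.get(key)
    if cached is not None:
        return json.loads(cached)             # cache hit: no database round trip
    profile = fetch_user_from_db(user_id)     # cache miss: query the database
    r.set(key, json.dumps(profile), ex=3600)  # populate the cache, 1-hour TTL
    return profile
```

The first call for a user pays the database cost and fills the cache; every call within the next hour is served from memory.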
2. Types of Data to Cache in Uber/Ola
A. Ride Data
- Ongoing rides: Information like the current status of a ride (e.g., waiting, in-progress, completed) can be cached so that the system doesn’t need to query the database for every user interaction.
- Ride details: Data like pickup/drop-off location, estimated fare, and driver information can be cached to avoid repeated database queries.
B. User Data
- User profiles: Caching frequently accessed data like user profile details (name, contact, payment info) reduces the need to fetch this data from the database for every request.
- User preferences: Information like preferred payment methods or ride history can be cached for faster access.
C. Location Data
- Driver and vehicle availability: The system frequently queries for the nearest available drivers for riders. Caching this data makes it quick to determine which drivers are available in a specific area, reducing lookup time and database load (see the geo-index sketch after this list).
- Geospatial data: Data related to location (e.g., maps, nearby points of interest, current traffic conditions) can be cached for quicker access.
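One common way to serve the nearest-driver lookup from a cache is a Redis geospatial index. Below is a minimal sketch; it assumes Redis 6.2+ (for GEOSEARCH), and the key name, driver IDs, and coordinates are illustrative:

```python
import redis

r = redis.Redis(decode_responses=True)

# Each available driver reports its position; GEOADD stores it in a geo index.
r.geoadd("drivers:available", (77.5946, 12.9716, "driver:42"))  # lon, lat, member
r.geoadd("drivers:available", (77.6101, 12.9352, "driver:97"))

# A rider requests a car: find available drivers within 5 km, nearest first.
nearby = r.geosearch(
    "drivers:available",
    longitude=77.60,
    latitude=12.95,
    radius=5,
    unit="km",
    sort="ASC",
    count=10,
)
print(nearby)  # e.g. ['driver:97', 'driver:42']
```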
D. Price Estimation Data
- Fare estimates: Calculating fares involves complex logic (distance, time, surge pricing, etc.). Storing pre-calculated fare estimates for popular routes can improve system responsiveness.
3. Cache Strategies for Uber/Ola
A. Time-based Expiry (TTL - Time To Live)
- Cached entries are assigned a TTL value, meaning they expire automatically after a set time. Data that changes frequently, such as driver locations or fare estimates, gets a short TTL (seconds or minutes), while more stable data can be cached for longer.
- Example: Cached driver availability data might expire every 5 minutes to reflect real-time changes in driver availability.
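A sketch of that 5-minute expiry using Redis SETEX; the key name and payload are illustrative:

```python
import json

import redis

r = redis.Redis(decode_responses=True)

# Cache the available drivers for a city zone with a 300-second (5-minute) TTL;
# Redis drops the key automatically once it expires.
zone_drivers = ["driver:42", "driver:97"]
r.setex("zone:blr-central:drivers", 300, json.dumps(zone_drivers))

print(r.ttl("zone:blr-central:drivers"))  # seconds remaining, e.g. 300
```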
B. Write-through Caching
- In a write-through cache, when data is written to the database, it’s also written to the cache. This ensures the cache is always consistent with the database.
- Example: When a user books a ride, the ride details are written to both the database and the cache so that subsequent requests fetch the ride details from the cache.
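A minimal write-through sketch; write_ride_to_db is a hypothetical stand-in for the real database insert, and the ride fields are illustrative:

```python
import json

import redis

r = redis.Redis(decode_responses=True)

def write_ride_to_db(ride: dict) -> None:
    pass  # stand-in for the durable database write (hypothetical)

def save_ride(ride: dict) -> None:
    """Write-through: persist to the database, then mirror into the cache."""
    write_ride_to_db(ride)                         # durable write happens first
    r.set(f"ride:{ride['id']}", json.dumps(ride))  # cache now matches the database

save_ride({"id": "r-1001", "status": "in-progress", "fare_estimate": 240})
```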
C. Cache Eviction Policies
- Least Recently Used (LRU): The least recently accessed data is evicted from the cache to make room for new data. This suits Uber/Ola well: data in active use (e.g., details of ongoing rides, active driver info) keeps getting touched, so it stays in the cache (a small LRU sketch follows this list).
- Least Frequently Used (LFU): Evicts the data accessed least often. This can work well for data whose popularity is stable over time, letting rarely touched entries (e.g., old ride history, seldom-read user preferences) be the first to go.
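In practice a Redis deployment would enable eviction via configuration such as maxmemory-policy allkeys-lru (or allkeys-lfu) rather than hand-rolling it, but the mechanism is easy to show; here is a tiny LRU cache:

```python
from collections import OrderedDict

class LRUCache:
    """Evicts the least recently accessed entry once capacity is exceeded."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.data: OrderedDict = OrderedDict()

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key)  # mark as most recently used
        return self.data[key]

    def put(self, key, value) -> None:
        if key in self.data:
            self.data.move_to_end(key)
        self.data[key] = value
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)  # evict the least recently used entry

cache = LRUCache(capacity=2)
cache.put("ride:1", "in-progress")
cache.put("ride:2", "waiting")
cache.get("ride:1")               # touching ride:1 makes ride:2 the LRU entry
cache.put("ride:3", "completed")  # evicts ride:2
```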
4. What is Load Balancing?
Load balancing is the practice of distributing incoming traffic across multiple servers or resources to ensure no single server becomes overloaded. In systems like Uber/Ola, load balancing ensures that requests from users and drivers are efficiently handled, improving response times and fault tolerance.
5. Types of Load Balancing
A. Round Robin Load Balancing
- Round Robin distributes incoming requests evenly across a pool of servers by rotating through them in order. It works well when the servers are similarly sized and requests have roughly uniform cost.
- Example for Uber/Ola: A load balancer can distribute incoming ride requests evenly across multiple backend servers responsible for processing ride data.
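A round-robin rotation is only a few lines; the server pool below is illustrative:

```python
import itertools

# Illustrative backend pool; a real deployment discovers these dynamically.
servers = itertools.cycle(["app-1:8080", "app-2:8080", "app-3:8080"])

def route_round_robin() -> str:
    """Each new request goes to the next server in the rotation."""
    return next(servers)

for i in range(4):
    print(f"ride-request-{i} -> {route_round_robin()}")
# app-1:8080, app-2:8080, app-3:8080, then back to app-1:8080
```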
B. Least Connections Load Balancing
- Least Connections directs traffic to the server with the fewest active connections. This method is useful when server performance is tied to the number of active requests.
- Example for Uber/Ola: When a server is already handling multiple ride requests, the load balancer might direct new requests to servers with fewer active sessions.
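A sketch of least-connections selection; real load balancers track these counts per backend automatically, and the numbers here are illustrative:

```python
# Live count of active connections per server (illustrative values).
active = {"app-1:8080": 12, "app-2:8080": 4, "app-3:8080": 9}

def route_least_connections() -> str:
    """Pick the server currently handling the fewest active connections."""
    server = min(active, key=active.get)
    active[server] += 1  # the new request now occupies a connection
    return server

def release(server: str) -> None:
    active[server] -= 1  # called when a request completes

print(route_least_connections())  # app-2:8080
```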
C. Weighted Load Balancing
- Weighted Load Balancing assigns a weight to each server based on its capacity. A server with higher capacity will receive more traffic than one with lower capacity.
- Example for Uber/Ola: A server located in a data center with higher capacity or closer to a major city might be given a higher weight, handling more traffic from users in that area.
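One simple way to realize weights is weighted random selection (many production balancers use weighted round robin instead, but the average effect is the same); the weights below are illustrative:

```python
import random

# Higher-capacity servers get proportionally more traffic.
servers = ["big-dc:8080", "mid-dc:8080", "small-dc:8080"]
weights = [5, 3, 1]  # big-dc receives roughly 5 of every 9 requests

def route_weighted() -> str:
    return random.choices(servers, weights=weights, k=1)[0]

print(route_weighted())
```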
D. Geo-based Load Balancing
- Geo-based Load Balancing directs traffic based on the geographical location of the user or driver. For example, users in India would be directed to servers in Asia, while users in the US would connect to servers in North America.
- Example for Uber/Ola: This reduces latency for both drivers and passengers, making ride requests and live trip updates feel faster and more responsive.
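In production, geo-routing is usually handled by GeoDNS or anycast rather than application code, but conceptually it reduces to a region-to-endpoint mapping. A sketch with hypothetical hostnames:

```python
# Hypothetical region-to-endpoint map; real systems resolve this via GeoDNS.
REGION_SERVERS = {
    "IN": "asia-south.api.example.com",
    "US": "us-east.api.example.com",
}
DEFAULT_SERVER = "global.api.example.com"

def route_by_region(country_code: str) -> str:
    """Send the user to the nearest regional cluster, else a global fallback."""
    return REGION_SERVERS.get(country_code, DEFAULT_SERVER)

print(route_by_region("IN"))  # asia-south.api.example.com
```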
6. Why Cache and Load Balancing are Important for Uber/Ola
A. Scalability
- Caching reduces the load on databases, allowing the system to scale as the number of users and rides increases.
- Load balancing distributes traffic evenly, preventing bottlenecks and ensuring the system can handle large volumes of requests during peak times (e.g., surge hours).
B. Reduced Latency
- Caching serves frequently accessed data straight from memory, significantly reducing the time it takes for users to get a response.
- Load balancing ensures that traffic is efficiently routed, preventing delays caused by overloaded servers.
C. High Availability
- Caching means that even if the database is temporarily unavailable, cached (possibly slightly stale) data can still be served to users, minimizing visible downtime.
- Load balancing ensures fault tolerance, as traffic can be rerouted to healthy servers in case of failure, maintaining the availability of the service.
D. Improved User Experience
- Caching provides fast access to data like nearby drivers or available cars, improving response times for ride requests.
- Load balancing ensures a stable and responsive experience, even during times of high demand, such as peak hours or busy locations.
7. Cache and Load Balancing Challenges for Uber/Ola
A. Cache Invalidation
- Cache invalidation (ensuring that outdated data is not served) can be tricky. For example, when a ride is completed, the cached ride data must be updated or removed.
- If the cache is not invalidated properly, users might see outdated information.
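A minimal invalidate-on-write sketch for the ride-completion case; mark_ride_completed_in_db is a hypothetical stand-in for the real update:

```python
import redis

r = redis.Redis(decode_responses=True)

def mark_ride_completed_in_db(ride_id: str) -> None:
    pass  # stand-in for the durable database update (hypothetical)

def complete_ride(ride_id: str) -> None:
    """Update the database first, then drop the stale cache entry."""
    mark_ride_completed_in_db(ride_id)
    r.delete(f"ride:{ride_id}")  # next read repopulates from the database
```

Deleting the key rather than rewriting it is a common choice: the next read repopulates the cache from the database, which avoids some update-ordering races between concurrent writers.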
B. Data Consistency
- When using caching strategies like write-through or write-back (where the cache is updated first and changes are flushed to the database later), it’s crucial to maintain consistency between the cache and the database. This is especially challenging when there are multiple servers or data centers.
C. Load Balancer Configuration
- The configuration of load balancers must be carefully planned to avoid issues like traffic bottlenecks or unequal distribution of load.
- Balancing load in geo-distributed systems (e.g., Uber/Ola operating across many countries) requires intelligent routing to optimize server usage.
D. Server Failures
- Load balancing mechanisms need to handle the failure of a server gracefully. If a server goes down, the load balancer should quickly detect the failure and reroute traffic to healthy servers.