1. Why Do We Need Partitioning & Replication?

  • Improves query performance – Reduces data lookup time by dividing the database.
  • Handles massive traffic – Distributes load efficiently across multiple servers.
  • Ensures high availability – Provides redundancy and prevents downtime.
  • Supports scalability – Makes it easier to scale as the system grows.



2. Database Partitioning (Sharding) for TinyURL

Partitioning (Sharding) divides the database into smaller chunks to distribute storage and query load across multiple servers.

 

A. Types of Partitioning for TinyURL

1. Horizontal Partitioning (Sharding) – Best for Large Scale

 

Approach: Store different sets of URLs across multiple databases based on the short URL ID or hash value.


Example: With 4 database shards, each short URL is stored on the shard selected by hash(short_url) % 4.

 

Short URL Hash               Target Shard
hash("abc123") % 4 = 1       DB1
hash("xyz456") % 4 = 2       DB2
hash("pqr789") % 4 = 3       DB3
hash("lmn567") % 4 = 0       DB4

 

Advantages:

  • Distributes storage load evenly.
  • Reduces query time by limiting searches to a single shard.
  • Supports large-scale data growth.

 

Implementation Example (Using Hash Modulo Sharding)
 
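import hashlib

def get_shard(url_id, total_shards=4):
    # Python's built-in hash() is salted per process for strings, so use a stable digest instead.
    return int(hashlib.md5(url_id.encode()).hexdigest(), 16) % total_shards  # Returns shard number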
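
As a rough check of the "even distribution" claim, the sketch below routes 10,000 made-up IDs through a stable hash and counts how many land on each shard. The helper names (`stable_shard`, `shard_histogram`), the ID format, and the choice of MD5 are assumptions for illustration; any hash that gives the same result on every server would work.

```python
import hashlib
from collections import Counter


def stable_shard(url_id: str, total_shards: int = 4) -> int:
    """Map a short URL ID to a shard index using a process-independent hash."""
    digest = hashlib.md5(url_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % total_shards


def shard_histogram(ids, total_shards: int = 4) -> Counter:
    """Count how many IDs land on each shard."""
    return Counter(stable_shard(i, total_shards) for i in ids)


if __name__ == "__main__":
    sample_ids = [f"url{n}" for n in range(10_000)]  # hypothetical IDs
    for shard, count in sorted(shard_histogram(sample_ids).items()):
        print(f"shard {shard}: {count} keys")
```

One design caveat: with plain modulo sharding, changing `total_shards` remaps most keys, so adding a shard usually means migrating data.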


2. Vertical Partitioning – Based on Data Type

Approach: Store frequently accessed data (e.g., short URL, long URL, and expiration date, which are all needed on every redirect) separately from less frequently accessed data (e.g., analytics, metadata).


Example:

 

  • DB1 (Core Data): short_url, long_url, created_at, expires_at
  • DB2 (Analytics): short_url, click_count, user_agent, location
  • DB3 (Blacklist Data): malicious_urls, reported_by

 

Advantages:

 

  • Improves read performance for core data.
  • Keeps analytics data separate to avoid slowing down URL lookups.
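
The split between core data and analytics can be sketched with two in-memory SQLite databases standing in for DB1 and DB2. The table names, function names, and the decision to store one row per click (and derive the count from it) are assumptions made for this example, not part of the original design.

```python
import sqlite3

# Two in-memory databases stand in for DB1 (core) and DB2 (analytics).
core_db = sqlite3.connect(":memory:")
analytics_db = sqlite3.connect(":memory:")

core_db.execute(
    "CREATE TABLE urls (short_url TEXT PRIMARY KEY, long_url TEXT NOT NULL, "
    "created_at TEXT, expires_at TEXT)"
)
analytics_db.execute(
    "CREATE TABLE clicks (short_url TEXT, user_agent TEXT, location TEXT)"
)


def create_url(short_url: str, long_url: str) -> None:
    """Write path: only the core database is touched."""
    with core_db:
        core_db.execute(
            "INSERT INTO urls (short_url, long_url, created_at) "
            "VALUES (?, ?, datetime('now'))",
            (short_url, long_url),
        )


def resolve(short_url: str, user_agent: str, location: str):
    """Redirect path: read from core, then record the click separately."""
    row = core_db.execute(
        "SELECT long_url FROM urls WHERE short_url = ?", (short_url,)
    ).fetchone()
    if row is None:
        return None
    with analytics_db:
        analytics_db.execute(
            "INSERT INTO clicks VALUES (?, ?, ?)",
            (short_url, user_agent, location),
        )
    return row[0]


def click_count(short_url: str) -> int:
    """Analytics query: never touches the core database."""
    return analytics_db.execute(
        "SELECT COUNT(*) FROM clicks WHERE short_url = ?", (short_url,)
    ).fetchone()[0]
```

In a real deployment the click write would likely be asynchronous (for example, via a queue) so that a slow analytics store cannot delay the redirect.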



3. Range-Based Partitioning – Based on Short URL ID

Approach: Store URLs in different tables or databases based on ID ranges.

Example:

 

  • DB1: short_url_id 1 to 10,000,000
  • DB2: short_url_id 10,000,001 to 20,000,000
  • DB3: short_url_id 20,000,001 to 30,000,000

 

Advantages:


  • Efficient range-based queries.
  • Simple to implement for sequentially generated IDs.
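
Range routing comes down to finding the first range whose upper bound is at least the ID. A minimal sketch, assuming each range's upper bound is inclusive and that the bounds and shard names below (taken from the example) are kept in ascending order; `shard_for_id` is a hypothetical helper name:

```python
from bisect import bisect_left

# Inclusive upper bound of each range, in ascending order (assumed layout).
RANGE_UPPER_BOUNDS = [10_000_000, 20_000_000, 30_000_000]
SHARD_NAMES = ["DB1", "DB2", "DB3"]


def shard_for_id(short_url_id: int) -> str:
    """Return the shard whose ID range contains short_url_id."""
    if short_url_id < 1:
        raise ValueError("IDs start at 1")
    # bisect_left finds the first bound that is >= short_url_id.
    idx = bisect_left(RANGE_UPPER_BOUNDS, short_url_id)
    if idx == len(SHARD_NAMES):
        raise ValueError(f"no shard covers ID {short_url_id}")
    return SHARD_NAMES[idx]
```

Because new IDs are sequential, all fresh writes land on the last range, so that shard tends to be the busiest one.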



B. Choosing the Right Partitioning Strategy

Partitioning Type        Best Use Case
Hash-Based Sharding      Large-scale URL shortening system with even distribution
Range-Based Sharding     Sequentially generated short URLs
Vertical Partitioning    Separating frequently accessed data from analytics


3. Database Replication for TinyURL

Replication ensures high availability, data redundancy, and load balancing by maintaining multiple copies of the database.


A. Types of Replication

1. Master-Slave Replication

Approach

 

  • Master DB – Handles writes (creating short URLs).
  • Slave DBs – Handle reads (resolving short URLs).

 

Example Setup:

  • Master DB (Write-heavy) – Stores new URLs.
  • 3 Read-Replica Slaves (Read-heavy) – Handle URL redirection requests.

 

Advantages:


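  • Offloads read queries to slaves.
  • Improves performance for read-heavy traffic.
  • Keeps redundant copies that can be promoted if the master fails (replication is not a substitute for regular backups, since bad writes are replicated too).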

 

Implementation (Using MySQL):

 
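-- Run on each slave. MySQL 8.0.23+ prefers CHANGE REPLICATION SOURCE TO / START REPLICA.
-- The log file and position come from SHOW MASTER STATUS on the master.
CHANGE MASTER TO
    MASTER_HOST='master-db',
    MASTER_USER='replica_user',
    MASTER_PASSWORD='password',
    MASTER_LOG_FILE='mysql-bin.000001',
    MASTER_LOG_POS=107;
START SLAVE;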
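
On the application side, master-slave replication implies read/write splitting: writes go to the master, reads rotate across the slaves. A minimal sketch of that routing decision follows; the class name `ReplicatedRouter` and the host names are hypothetical, and classifying statements by their first keyword is a simplification.

```python
import itertools
from typing import Sequence


class ReplicatedRouter:
    """Send writes to the master and rotate reads across replicas."""

    def __init__(self, master: str, replicas: Sequence[str]):
        self.master = master
        # Fall back to the master if no replicas are configured.
        self._read_cycle = itertools.cycle(list(replicas) or [master])

    def route(self, sql: str) -> str:
        """Return the host name that should execute this statement."""
        parts = sql.split()
        verb = parts[0].upper() if parts else ""
        if verb == "SELECT":
            return next(self._read_cycle)
        # Anything that is not a plain SELECT is treated as a write.
        return self.master
```

Usage: `ReplicatedRouter("master-db", ["replica-1", "replica-2"]).route("SELECT ...")` returns one of the replica names. Since replication is typically asynchronous, a read issued right after a write may not see the new URL yet; reading from the master for a short window after creation is one way to handle that.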


2. Master-Master Replication (For High Availability)

Approach: Both databases can handle reads and writes, syncing with each other.
Use Case: Large-scale systems requiring high availability (e.g., global TinyURL system).
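Disadvantages: Concurrent writes on both masters can conflict (e.g., the same short URL or ID created on each side), so conflict detection and resolution become more complex.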
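
One common way to avoid a single class of conflict, two masters generating the same numeric ID, is to give each master a disjoint ID sequence. MySQL supports this through the `auto_increment_increment` and `auto_increment_offset` system variables. The sketch below imitates the idea in Python; the class name is hypothetical, and it does not address conflicting updates to the same row.

```python
class InterleavedIdAllocator:
    """Hand out IDs offset, offset + step, offset + 2 * step, and so on."""

    def __init__(self, offset: int, step: int):
        if not 1 <= offset <= step:
            raise ValueError("offset must be between 1 and step")
        self._next = offset
        self._step = step

    def next_id(self) -> int:
        """Return the next ID reserved for this master."""
        allocated = self._next
        self._next += self._step
        return allocated
```

With two masters, `InterleavedIdAllocator(1, 2)` yields odd IDs and `InterleavedIdAllocator(2, 2)` yields even IDs, so neither side can collide with the other.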



3. Read Replicas with Load Balancing

Approach: Use multiple read replicas and a load balancer (e.g., HAProxy, Nginx) to distribute traffic.

 

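Example Load Balancer Setup (Nginx TCP Stream Proxy for DB Replicas):

 
# Top-level block in nginx.conf; requires the stream module.
stream {
    upstream read_replicas {
        least_conn;
        server db-replica1:3306;
        server db-replica2:3306;
        server db-replica3:3306;
    }
    server {
        listen 3306;
        proxy_pass read_replicas;
    }
}

 

Benefit: With least_conn, each new connection goes to the replica with the fewest active connections. MySQL traffic is raw TCP rather than HTTP, so the proxy belongs in Nginx's stream context instead of an HTTP server block.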
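
Least-connections selection, one way to define "least busy", can be sketched in a few lines. This is an illustration of the selection rule only, not how any particular load balancer is implemented; the class and method names are hypothetical.

```python
class LeastConnectionsBalancer:
    """Pick the replica with the fewest in-flight requests."""

    def __init__(self, replicas):
        # Active connection count per replica, in configuration order.
        self.active = {name: 0 for name in replicas}

    def acquire(self) -> str:
        """Choose a replica and count the new connection against it."""
        # min() returns the first minimum, so ties go to the earliest replica.
        name = min(self.active, key=self.active.get)
        self.active[name] += 1
        return name

    def release(self, name: str) -> None:
        """Mark one connection on the given replica as finished."""
        if self.active[name] == 0:
            raise ValueError(f"{name} has no active connections")
        self.active[name] -= 1
```

Compared with round-robin, this rule adapts when some queries run much longer than others, because a replica stuck on slow queries keeps a high count and is skipped.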



4. Combining Partitioning & Replication

For massive-scale systems (handling billions of URLs), we can combine partitioning and replication.

 

Hybrid Strategy:


  • Sharding – Divide data across multiple DB servers.
  • Replication – Each shard has multiple replicas for load balancing.
  • Caching – Use Redis/Memcached for frequently accessed short URLs.

 

Example System Architecture:

  • Shards: URLs are sharded by hash (Shard 1, Shard 2, Shard 3).
  • Replication: Each shard has one master and multiple read replicas.
  • Load Balancer: Directs queries to replica servers for quick lookups.
  • Cache Layer: Redis stores popular URLs for fast access.
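
The lookup path through this architecture (cache first, then a replica of the owning shard) can be sketched as a toy in-process model. Plain dictionaries stand in for Redis and for the database servers, replication is done synchronously for simplicity, and every name below is hypothetical.

```python
import hashlib
import itertools


class ShardedUrlStore:
    """Toy lookup path: cache, then a replica of the owning shard."""

    def __init__(self, num_shards: int = 3, replicas_per_shard: int = 2):
        if num_shards < 1 or replicas_per_shard < 1:
            raise ValueError("need at least one shard and one replica per shard")
        self.cache = {}  # stands in for Redis
        self.masters = [{} for _ in range(num_shards)]
        self.replicas = [
            [{} for _ in range(replicas_per_shard)] for _ in range(num_shards)
        ]
        # One round-robin iterator per shard plays the role of the load balancer.
        self._cycles = [itertools.cycle(group) for group in self.replicas]

    def _shard(self, short_url: str) -> int:
        """Stable hash-based shard selection."""
        digest = hashlib.md5(short_url.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self.masters)

    def create(self, short_url: str, long_url: str) -> None:
        """Write to the shard's master, then copy to its replicas."""
        shard = self._shard(short_url)
        self.masters[shard][short_url] = long_url
        for replica in self.replicas[shard]:  # synchronous copy, for simplicity
            replica[short_url] = long_url

    def resolve(self, short_url: str):
        """Read path: cache hit, else replica lookup followed by cache fill."""
        if short_url in self.cache:
            return self.cache[short_url]
        replica = next(self._cycles[self._shard(short_url)])
        long_url = replica.get(short_url)
        if long_url is not None:
            self.cache[short_url] = long_url
        return long_url
```

A real cache layer would also need expiry or eviction so that deleted or expired URLs do not keep resolving from the cache.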


5. Final Takeaways

  • Partitioning (Sharding): Splits data across multiple servers for scalability.
  • Replication: Ensures redundancy and improves read performance.
  • Best Strategy: Use hash-based sharding for URL distribution & master-slave replication for fast reads.
  • Optimization: Use load balancers & caching for high traffic.