High-Level Design of Facebook Messenger

Introduction

Facebook Messenger powers real-time conversations for over 500 million active users daily. Building such a system demands expertise in scalable architectures, distributed data stores, and low-latency communication. If you’re preparing for a system design interview, our complete system design training provides hands-on projects to master these concepts.

Core Requirements and Goals

Functional Requirements

  • User Management: Registration, authentication, profile updates, and contact management.
  • Messaging: 1:1 and group chat support for text, images, videos, and files.
  • Real‑Time Delivery: Instant message updates via persistent connections.
  • Notifications: In-app and push alerts for new messages.
  • End‑to‑End Security: End‑to‑end encryption so that only the participants in a conversation can read its messages.

For a quick primer on system design fundamentals, explore our system design crash course.

Non‑Functional Requirements

  • Scalability: Seamlessly handle millions of concurrent users.
  • Low Latency: Deliver messages in under 100 ms.
  • High Availability: Fault‑tolerant services with multi‑region failover.
  • Data Consistency: Prevent message loss or duplication.
  • Storage Efficiency: Optimize for petabytes of text and media.

Capacity Planning and Constraints

What traffic can Facebook Messenger handle?

Assuming 25 billion messages per day, with an average of 2.33 reads per message:

  • Write Requests: 25 B
  • Read Requests: 58 B
  • Total Requests: 83 B per day

That translates to an average of about 960 K queries per second (83 B requests ÷ 86,400 seconds) and a peak of approximately 4.8 M QPS, which corresponds to assuming peak traffic is 5× the average.
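The traffic arithmetic can be reproduced with a short script. This is a minimal sketch: the message volume and read ratio come from the figures in this section, while the 5× peak factor is an assumption inferred from the ratio of the peak to the average QPS.

```python
# Back-of-the-envelope traffic estimate using the figures quoted in this section.
SECONDS_PER_DAY = 24 * 60 * 60           # 86,400

writes_per_day = 25e9                    # 25 B messages sent per day
reads_per_message = 2.33                 # average reads per message
peak_factor = 5                          # assumption: peak traffic = 5x average

reads_per_day = writes_per_day * reads_per_message   # about 58 B
total_per_day = writes_per_day + reads_per_day       # about 83 B

avg_qps = total_per_day / SECONDS_PER_DAY            # roughly 960 K
peak_qps = avg_qps * peak_factor                     # roughly 4.8 M

print(f"reads/day : {reads_per_day / 1e9:.1f} B")
print(f"total/day : {total_per_day / 1e9:.1f} B")
print(f"avg QPS   : {avg_qps / 1e3:.0f} K")
print(f"peak QPS  : {peak_qps / 1e6:.1f} M")
```

Changing `peak_factor` is the quickest way to see how sensitive the provisioning target is to the burstiness assumption.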

How much storage is needed?

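  • Daily Text Storage: 25 TB (25 B messages at an assumed ~1 KB each)
  • Annual Text Storage: 9 PB (25 TB × 365 days)
  • 5‑Year Projection (Text + Media): 450 PB, of which roughly 45 PB is text (9 PB × 5) and the remainder is media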

Bandwidth and Cache Requirements

  • Daily Bandwidth: 1.3 PB (text + media) ≈ 15 GB/sec
  • Caching: 10 % of messages (2.5 TB) stored in Redis for hot reads, plus 5 TB for metadata/indexing.
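The storage, bandwidth, and cache numbers follow from a few multiplications. The sketch below reproduces them; the ~1 KB average message size is an assumption implied by 25 TB per day for 25 B messages, and decimal units (1 TB = 10^12 bytes) are used throughout.

```python
# Storage, bandwidth, and cache estimates from the figures in this section.
GB = 1e9
TB = 1e12
PB = 1e15
SECONDS_PER_DAY = 86_400

messages_per_day = 25e9
avg_message_bytes = 1_000                 # assumption: ~1 KB of text per message

daily_text = messages_per_day * avg_message_bytes    # 25 TB
annual_text = daily_text * 365                        # about 9.1 PB
five_year_text = annual_text * 5                      # about 45.6 PB (text only)

daily_bandwidth = 1.3 * PB                            # text + media, as quoted
bandwidth_per_sec = daily_bandwidth / SECONDS_PER_DAY # about 15 GB/sec

hot_message_cache = 0.10 * daily_text                 # 10 % of a day's messages
metadata_cache = 5 * TB
total_cache = hot_message_cache + metadata_cache      # 7.5 TB

print(f"daily text     : {daily_text / TB:.1f} TB")
print(f"annual text    : {annual_text / PB:.2f} PB")
print(f"5-year text    : {five_year_text / PB:.1f} PB")
print(f"bandwidth      : {bandwidth_per_sec / GB:.1f} GB/sec")
print(f"cache footprint: {total_cache / TB:.1f} TB")
```

The 5-year text figure of about 45 PB shows that the 450 PB projection is dominated by media rather than text.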

For insights into data analysis at scale, check our data science specialization.

System APIs Overview

A RESTful and WebSocket‑based API surface handles user interactions:

  1. Authentication (POST /api/login)

    • Issues a signed JWT that the client presents on subsequent requests for session management.

  2. Send Message (POST /api/messages)

    • Persists to NoSQL, updates SQL conversation metadata, and pushes via Kafka/WebSockets.

  3. Fetch Messages (GET /api/messages?chat_id={id})

    • Queries Cassandra/DynamoDB with Redis caching and pagination.

  4. Fetch Conversations (GET /api/conversations?user_id={id})

    • Joins SQL tables for recent chat summaries and unread counts.

  5. User Presence (GET /api/user-presence?user_id={id})

    • Retrieves online/offline status from Redis.

  6. Group Chat (/api/groups endpoints)

    • Creates groups, sends and fetches group messages, and manages participants.
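To make the authentication step (item 1) concrete, here is a minimal sketch of issuing and verifying an HS256-signed JWT using only the Python standard library. The secret, the claim names beyond the standard `sub`/`iat`/`exp`, and the helper names `issue_token` and `verify_token` are illustrative assumptions; a production service would normally rely on a vetted JWT library and managed keys.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional

SECRET = b"demo-secret-change-me"   # hypothetical signing key, for illustration only


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    """Decode base64url text, restoring the stripped padding first."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def issue_token(user_id: str, ttl_seconds: int = 3600,
                now: Optional[float] = None) -> str:
    """Return a compact JWT (header.payload.signature) for the given user."""
    now = time.time() if now is None else now
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {"sub": user_id, "iat": int(now), "exp": int(now) + ttl_seconds}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    signature = hmac.new(SECRET, signing_input.encode("ascii"), hashlib.sha256).digest()
    return f"{signing_input}.{_b64url(signature)}"


def verify_token(token: str, now: Optional[float] = None) -> Optional[dict]:
    """Return the claims if the signature is valid and the token is unexpired."""
    now = time.time() if now is None else now
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
        signing_input = f"{header_b64}.{payload_b64}".encode("ascii")
        expected = hmac.new(SECRET, signing_input, hashlib.sha256).digest()
        # Constant-time comparison avoids leaking how many bytes matched.
        if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
            return None
        claims = json.loads(_b64url_decode(payload_b64))
    except ValueError:
        return None   # malformed token, bad base64, or bad JSON
    if claims.get("exp", 0) <= now:
        return None   # expired
    return claims
```

The API Gateway would call something like `verify_token` on each request and reject the call when it returns `None`.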

For a hands‑on introduction to building real‑time web applications, explore our web development bootcamp.

Database Schema: SQL and NoSQL Hybrid

To balance consistency and performance, this design combines three kinds of storage:

  • SQL (PostgreSQL/MySQL)

    • Users, conversations, attachments, and group metadata.

  • NoSQL (Cassandra/DynamoDB)

    • High‑write message storage with partition keys for fast retrieval.

  • Redis

    • In‑memory storage for presence, typing indicators, and hot messages.
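As an illustration of partition keys for fast retrieval, the sketch below models a message table keyed by conversation ID, with messages ordered inside each partition and fetched newest-first using a cursor. It is an in-memory stand-in, not Cassandra or DynamoDB code; the class name `MessageStore` and the `before_id` cursor are assumptions made for the example.

```python
import bisect
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Message:
    conversation_id: str   # partition key: all rows of a chat live together
    message_id: int        # clustering key: increases within a conversation
    sender_id: str
    body: str


class MessageStore:
    """In-memory model of a wide-column table partitioned by conversation."""

    def __init__(self) -> None:
        self._partitions: Dict[str, List[Message]] = defaultdict(list)

    def append(self, conversation_id: str, sender_id: str, body: str) -> Message:
        rows = self._partitions[conversation_id]
        msg = Message(conversation_id, len(rows) + 1, sender_id, body)
        rows.append(msg)   # appends keep the partition sorted by message_id
        return msg

    def fetch_page(self, conversation_id: str, limit: int = 20,
                   before_id: Optional[int] = None
                   ) -> Tuple[List[Message], Optional[int]]:
        """Return up to `limit` messages, newest first, older than `before_id`.

        The second element is the cursor for the next (older) page, or None
        when the start of the conversation has been reached.
        """
        rows = self._partitions.get(conversation_id, [])
        ids = [m.message_id for m in rows]
        end = len(rows) if before_id is None else bisect.bisect_left(ids, before_id)
        start = max(0, end - limit)
        page = rows[start:end][::-1]
        next_cursor = page[-1].message_id if start > 0 and page else None
        return page, next_cursor
```

A client would call `fetch_page(chat_id)` for the latest messages and pass the returned cursor back as `before_id` when the user scrolls up. Because every read touches a single partition, the cost of a page does not depend on the total number of conversations.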

If you’re brushing up on algorithms and data modeling, our essential DSA and web development courses cover these fundamentals.

When to Use SQL vs NoSQL?

| Use Case             | Recommended Database | Rationale                                  |
|----------------------|----------------------|--------------------------------------------|
| User Profiles        | SQL                  | Transactional integrity and relations      |
| Chat Messages        | NoSQL                | High throughput and horizontal scalability |
| Conversation Indexes | SQL                  | Consistent metadata joins                  |
| Presence Tracking    | Redis                | Low‑latency reads and writes               |
| File Metadata        | SQL                  | Referential integrity                      |

For a deep dive into data structures, check out our DSA fundamentals course.

High‑Level Architecture and Message Flow

System Component Placement

  • API Gateway: Authentication and routing
  • Microservices: Authentication, Messaging, Notifications, Presence, Media Storage
  • WebSocket Servers: Real‑time message push
  • Message Broker (Kafka): Durable queueing for offline delivery
  • Load Balancers (Nginx): Distribute HTTP and WebSocket traffic
  • Caching (Redis): Hot data and presence
  • CDN/S3: Media storage and delivery
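The presence data held in the cache layer is typically modeled as a key per user that expires unless the client keeps sending heartbeats. The sketch below shows that idea with an in-memory dictionary standing in for Redis keys with a TTL; the class name `PresenceTracker` and the 30-second TTL are assumptions for illustration.

```python
import time
from typing import Callable, Dict


class PresenceTracker:
    """In-memory stand-in for per-user presence keys that expire after a TTL."""

    def __init__(self, ttl_seconds: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock                      # injectable so tests can control time
        self._expires_at: Dict[str, float] = {}

    def heartbeat(self, user_id: str) -> None:
        """Called on connect and periodically afterwards; refreshes the TTL."""
        self._expires_at[user_id] = self._clock() + self._ttl

    def disconnect(self, user_id: str) -> None:
        """Called on a clean disconnect; marks the user offline immediately."""
        self._expires_at.pop(user_id, None)

    def is_online(self, user_id: str) -> bool:
        expires = self._expires_at.get(user_id)
        if expires is None:
            return False
        if expires <= self._clock():
            del self._expires_at[user_id]        # lazily drop the expired entry
            return False
        return True
```

The expiry handles clients that vanish without a clean disconnect: if heartbeats stop, the user is reported offline once the TTL elapses.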

How are messages routed?

  1. User A sends a message to the API Gateway.

  2. Message Service checks recipient presence.

    • Online: Push via WebSocket.

    • Offline: Queue in Kafka.

  3. Delivery: Update status from sent → delivered → read.

  4. Group Chats: Fan out to every member, pushing to those who are online and queueing for those who are offline.
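The routing steps above can be sketched as a small in-memory router. Plain lists stand in for WebSocket connections and a per-user deque stands in for the Kafka queue; the names `MessageRouter`, `connect`, and `send_to_group` are assumptions for the example rather than parts of any real service.

```python
from collections import defaultdict, deque
from dataclasses import dataclass
from enum import Enum
from typing import Deque, Dict, List


class Status(Enum):
    SENT = "sent"
    DELIVERED = "delivered"
    READ = "read"


@dataclass
class Message:
    message_id: int
    sender: str
    recipient: str
    body: str
    status: Status = Status.SENT


class MessageRouter:
    def __init__(self) -> None:
        self.connections: Dict[str, List[Message]] = {}               # online users
        self.offline: Dict[str, Deque[Message]] = defaultdict(deque)  # queued messages

    def connect(self, user_id: str) -> List[Message]:
        """Register a connection and flush anything queued while offline."""
        inbox: List[Message] = []
        self.connections[user_id] = inbox
        pending = self.offline.pop(user_id, deque())
        while pending:
            self._push(inbox, pending.popleft())
        return inbox

    def disconnect(self, user_id: str) -> None:
        self.connections.pop(user_id, None)

    def send(self, message: Message) -> None:
        """Push if the recipient is online, otherwise queue for later."""
        inbox = self.connections.get(message.recipient)
        if inbox is not None:
            self._push(inbox, message)
        else:
            self.offline[message.recipient].append(message)

    def send_to_group(self, sender: str, members: List[str],
                      body: str, first_id: int) -> List[Message]:
        """Fan out one copy per member other than the sender."""
        recipients = [m for m in members if m != sender]
        copies = [Message(first_id + i, sender, m, body)
                  for i, m in enumerate(recipients)]
        for copy in copies:
            self.send(copy)
        return copies

    @staticmethod
    def _push(inbox: List[Message], message: Message) -> None:
        inbox.append(message)
        message.status = Status.DELIVERED

    @staticmethod
    def mark_read(message: Message) -> None:
        if message.status is Status.DELIVERED:
            message.status = Status.READ
```

A message therefore stays in the `SENT` state until its recipient has a live connection, and only a delivered message can move to `READ`.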

Our system design crash course covers similar patterns end to end for a quicker path to building this yourself.

Enabling Real‑Time Communication

What is Polling?

Clients periodically request new messages. High latency and wasted requests make this unsuitable at scale.

What is Long Polling?

Servers hold connections open until new data arrives, reducing empty responses but still requiring reconnects.
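A minimal sketch of the server side of long polling, using `asyncio` from the standard library: the handler waits on the user's inbox until a message arrives or a timeout elapses, then responds. The `long_poll` and `demo` names, the queue-per-user inbox, and the timeout values are assumptions for illustration.

```python
import asyncio
from typing import List


async def long_poll(inbox: asyncio.Queue, timeout: float) -> List[str]:
    """Hold the request open until data arrives or the timeout elapses."""
    try:
        first = await asyncio.wait_for(inbox.get(), timeout)
    except asyncio.TimeoutError:
        return []                      # empty response; the client reconnects
    messages = [first]
    while not inbox.empty():           # drain anything else already waiting
        messages.append(inbox.get_nowait())
    return messages


async def demo():
    inbox: asyncio.Queue = asyncio.Queue()

    async def producer() -> None:
        await asyncio.sleep(0.05)      # a message arrives shortly after the poll starts
        await inbox.put("hello")

    task = asyncio.create_task(producer())
    got = await long_poll(inbox, timeout=1.0)      # returns as soon as data arrives
    await task
    empty = await long_poll(inbox, timeout=0.05)   # nothing pending: times out
    return got, empty


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Each response ends the request, so the client has to issue a new one immediately, which is the reconnect overhead that persistent connections avoid.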

Why WebSockets?

Persistent two‑way connections deliver instant updates with minimal overhead. Fallbacks to long polling can handle firewalls or unsupported clients.

For real‑time application best practices, see our web development bootcamp.

Data Retention and Cleanup

  • Archival: Move messages older than one year to cold storage (S3/Glacier).

  • Soft Deletion: Mark records for removal, then purge in background jobs (Airflow/Kafka Streams).

  • Media Cleanup: Separate lifecycle policies for attachments in CDN/S3.
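The retention rules above can be expressed as a single classification pass that a background job might run. This is a sketch under stated assumptions: the one-year archival threshold comes from the list above, while the 30-day grace period before purging soft-deleted records and the names `StoredMessage` and `run_cleanup` are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import List, Optional, Tuple

ARCHIVE_AFTER = timedelta(days=365)   # from the archival rule above
PURGE_GRACE = timedelta(days=30)      # assumption: grace period before hard delete


@dataclass
class StoredMessage:
    message_id: int
    created_at: datetime
    deleted_at: Optional[datetime] = None   # set when the record is soft-deleted


def soft_delete(msg: StoredMessage, now: datetime) -> None:
    """Flag the record; the data stays in place until the purge job runs."""
    msg.deleted_at = now


def run_cleanup(messages: List[StoredMessage], now: datetime
                ) -> Tuple[List[StoredMessage], List[StoredMessage], List[int]]:
    """Split messages into (keep in hot storage, move to cold storage, purged IDs)."""
    keep: List[StoredMessage] = []
    archive: List[StoredMessage] = []
    purged: List[int] = []
    for msg in messages:
        if msg.deleted_at is not None:
            if now - msg.deleted_at >= PURGE_GRACE:
                purged.append(msg.message_id)   # hard delete after the grace period
            else:
                keep.append(msg)                # still flagged, not yet purged
        elif now - msg.created_at >= ARCHIVE_AFTER:
            archive.append(msg)                 # older than one year: cold storage
        else:
            keep.append(msg)
    return keep, archive, purged
```

Keeping the decision logic in one pure function makes it easy to test separately from whichever scheduler runs it.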

Partitioning, Replication, and Caching

  • Sharding: Horizontal partitioning by user or conversation ID.

  • Replication: Leader‑follower replication, with the leader accepting writes and followers serving reads, plus multi‑region replicas for global availability.

  • Cache Strategies: Redis for recent messages, CDN for static assets.
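Two of these ideas are easy to show in a few lines: choosing a shard from a conversation ID, and reading recent messages through a cache that falls back to the database on a miss (cache-aside). The sketch uses simple hash-modulo sharding and an in-memory LRU in place of Redis; `shard_for`, `RecentMessageCache`, and the loader callback are assumptions for illustration.

```python
import hashlib
from collections import OrderedDict
from typing import Callable, List


def shard_for(conversation_id: str, num_shards: int) -> int:
    """Map a conversation to a shard with a stable hash (same input, same shard)."""
    digest = hashlib.sha256(conversation_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class RecentMessageCache:
    """Cache-aside with LRU eviction; an in-memory stand-in for Redis."""

    def __init__(self, capacity: int, loader: Callable[[str], List[str]]) -> None:
        self._capacity = capacity
        self._loader = loader                    # reads from the message store on a miss
        self._entries: "OrderedDict[str, List[str]]" = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, conversation_id: str) -> List[str]:
        if conversation_id in self._entries:
            self._entries.move_to_end(conversation_id)   # mark as most recently used
            self.hits += 1
            return self._entries[conversation_id]
        self.misses += 1
        value = self._loader(conversation_id)            # fall back to the database
        self._entries[conversation_id] = value
        if len(self._entries) > self._capacity:
            self._entries.popitem(last=False)            # evict least recently used
        return value

    def invalidate(self, conversation_id: str) -> None:
        """Call after a write so the next read reloads fresh data."""
        self._entries.pop(conversation_id, None)
```

Hash-modulo sharding is the simplest option, but changing `num_shards` remaps most keys; consistent hashing is a common alternative when shards are added or removed often.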

Understanding these patterns is critical for distributed systems engineers preparing for roles at top tech companies.

Future Enhancements: Voice and Video Calling

  • WebRTC for peer‑to‑peer media streams.
  • TURN/STUN Servers for NAT traversal.
  • Selective Forwarding Unit (SFU) for group calls.
  • Adaptive Bitrate (ABR) for network QoS.
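  • Media Encryption with DTLS‑SRTP, which is end‑to‑end on peer‑to‑peer calls; calls relayed through an SFU need an additional encryption layer to remain end‑to‑end.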

Common Q&A on Messenger’s System Design

How does Messenger handle real‑time messaging?

WebSockets provide low‑latency delivery. For additional context, review our Netflix DSA interview guide on streaming systems.

How are messages stored?

Messages land in a NoSQL store optimized for writes, then a SQL store tracks metadata. Prepare with our top 20 DSA interview questions.

How are offline users notified?

A Notification Service uses FCM/APNs with Kafka to queue messages. If you’re interviewing at Meta, check our Meta/Facebook DSA prep.

How does the system scale globally?

Load balancers, sharding, and multi‑region replication ensure users connect to the nearest data center.

How are deleted messages handled?

Soft deletion flags records before background cleanup. For enterprise‑grade cleanup strategies, see our Atlassian DSA interview guide.

This insightful blog post is authored by Somya Rajput, who brings his expertise and deep understanding of the topic to provide valuable perspectives.
