Course Content
Data Structures & Algorithms
Full Stack Web Development
Understanding and playing with DOM (Document Object Model)
0/2
MERN project
0/2
Low Level System Design
LLD Topics
High Level System Design
Fast-Track to Full Spectrum Software Engineering
Database Design & Schema for Newsfeed System

The database design for a Newsfeed system is crucial to ensure that it can efficiently handle the large amount of data, including user profiles, posts, interactions (likes, comments, shares), and notifications. We will also discuss how to structure the schema to ensure the system can scale and perform well under high load.

 

1. Database Design Overview

For a Newsfeed system, we will typically use a Relational Database Management System (RDBMS), such as MySQL or PostgreSQL, for storing structured data like user information, posts, comments, and interactions. However, you can also use NoSQL databases like MongoDB or Cassandra if your system requires greater scalability for unstructured data or if you’re optimizing for high availability.

 

We’ll divide the schema into several core entities:

 

  • User
  • Post
  • Comment
  • Like
  • Share
  • Follow (for user following behavior)
  • Notification
  • Trending Topics

 

2. Core Entities and Schema

Here’s a simple Relational Database Schema for a Newsfeed system:



User Table

The User table stores basic information about each user who interacts with the system.

 

Column Data Type Description
user_id INT Primary Key, Unique User ID
username VARCHAR(255) Username, Unique
email VARCHAR(255) User’s email address, Unique
password_hash VARCHAR(255) Hashed password
first_name VARCHAR(255) User’s first name
last_name VARCHAR(255) User’s last name
profile_pic VARCHAR(255) URL to the user’s profile picture
created_at TIMESTAMP Date the user registered
updated_at TIMESTAMP Date the user profile was last updated


Post Table

The Post table stores content that users post, along with metadata like timestamps and visibility (public/private).

 

Column Data Type Description
post_id INT Primary Key, Unique Post ID
user_id INT Foreign Key to User table
content TEXT The content of the post
media_url VARCHAR(255) URL to any media (image/video) attached
created_at TIMESTAMP Timestamp of post creation
updated_at TIMESTAMP Timestamp of last post update
visibility ENUM(‘public’, ‘private’) Visibility status of the post
hashtags TEXT Optional: A list of hashtags for the post


Comment Table

The Comment table stores all comments made on posts.

 

Column Data Type Description
comment_id INT Primary Key, Unique Comment ID
post_id INT Foreign Key to Post table
user_id INT Foreign Key to User table
content TEXT The comment content
created_at TIMESTAMP Timestamp when the comment was created


Like Table

The Like table stores likes for each post.

 

Column Data Type Description
like_id INT Primary Key, Unique Like ID
post_id INT Foreign Key to Post table
user_id INT Foreign Key to User table
created_at TIMESTAMP Timestamp when the like was created

 

Share Table

The Share table stores shares of posts. This allows tracking of how many times a post has been shared by users.

 

Column Data Type Description
share_id INT Primary Key, Unique Share ID
post_id INT Foreign Key to Post table
user_id INT Foreign Key to User table
shared_at TIMESTAMP Timestamp when the share occurred


Follow Table

The Follow table stores user follow relationships (one user follows another).

 

Column Data Type Description
follow_id INT Primary Key, Unique Follow ID
follower_id INT Foreign Key to User table (follower)
followed_id INT Foreign Key to User table (followed)
created_at TIMESTAMP Timestamp when the follow action occurred


Notification Table

The Notification table stores notifications for users when someone likes, comments, or shares their posts, or when they are mentioned.

 

Column Data Type Description
notification_id INT Primary Key, Unique Notification ID
user_id INT Foreign Key to User table (the recipient)
type ENUM(‘like’, ‘comment’, ‘share’, ‘mention’) Type of notification
message TEXT The notification message
post_id INT Foreign Key to Post table (optional)
created_at TIMESTAMP Timestamp when the notification was created


Trending Topics Table

The Trending Topics table stores popular hashtags, keywords, or topics.

 

Column Data Type Description
topic_id INT Primary Key, Unique Topic ID
topic_name VARCHAR(255) Name of the trending topic (e.g., hashtag)
post_count INT The number of posts containing this topic
created_at TIMESTAMP Timestamp when the topic was first tracked


3. Indexing and Performance Optimization

Given that the Newsfeed system involves frequent querying for posts and interactions (likes, comments), we need to consider indexing on certain columns to improve performance.

 

Indexes on the Post Table:

 

  • Index on user_id: To quickly retrieve all posts made by a specific user.
  • Index on created_at: For efficient querying of posts based on time (e.g., to get the latest posts).
  • Index on visibility: For filtering posts based on their visibility (public/private).

 

Indexes on the Like/Comment/Share Tables:

 

  • Index on post_id: To quickly retrieve all likes/comments/shares for a specific post.
  • Index on user_id: For fetching all likes/comments by a specific user.

 

Indexes on the Follow Table:

 

  • Index on follower_id: To get all users following a specific user.
  • Index on followed_id: To get all followers of a specific user.

 

Indexes on the Notification Table:

 

  • Index on user_id: To get all notifications for a specific user.
  • Index on created_at: For fetching recent notifications.


4. Example Schema for a Relational Database

Here’s a high-level SQL example for creating a simplified schema for some of the above tables:

CREATE TABLE User (
user_id INT AUTO_INCREMENT PRIMARY KEY,
username VARCHAR(255) UNIQUE NOT NULL,
email VARCHAR(255) UNIQUE NOT NULL,
password_hash VARCHAR(255) NOT NULL,
first_name VARCHAR(255),
last_name VARCHAR(255),
profile_pic VARCHAR(255),
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
);

CREATE TABLE Post (
post_id INT AUTO_INCREMENT PRIMARY KEY,
user_id INT NOT NULL,
content TEXT,
media_url VARCHAR(255),
visibility ENUM('public', 'private'),
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
FOREIGN KEY (user_id) REFERENCES User(user_id)
);

CREATE TABLE Comment (
comment_id INT AUTO_INCREMENT PRIMARY KEY,
post_id INT NOT NULL,
user_id INT NOT NULL,
content TEXT,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
FOREIGN KEY (post_id) REFERENCES Post(post_id),
FOREIGN KEY (user_id) REFERENCES User(user_id)
);

CREATE TABLE Like (
like_id INT AUTO_INCREMENT PRIMARY KEY,
post_id INT NOT NULL,
user_id INT NOT NULL,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
FOREIGN KEY (post_id) REFERENCES Post(post_id),
FOREIGN KEY (user_id) REFERENCES User(user_id)
);

CREATE TABLE Notification (
notification_id INT AUTO_INCREMENT PRIMARY KEY,
user_id INT NOT NULL,
type ENUM('like', 'comment', 'share', 'mention'),
message TEXT,
post_id INT,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
FOREIGN KEY (user_id) REFERENCES User(user_id),
FOREIGN KEY (post_id) REFERENCES Post(post_id)
);



5. Optimizing for Scalability

To ensure the system can scale as the user base grows:

 

  • Sharding/Partitioning: You may want to partition tables like Post and Comment based on user ID or geographic location to improve read and write performance.

 

  • Caching: Use a caching layer (e.g., Redis) to cache frequently accessed data such as posts, comments, and likes, especially for high-traffic users.

 

  • NoSQL Databases: For certain high-velocity and unstructured data, such as activity streams, you can use NoSQL databases like Cassandra or MongoDB to handle massive write throughput and scale horizontally.
0% Complete
WhatsApp Icon

Hi Instagram Fam!
Get a FREE Cheat Sheet on System Design.

Hi LinkedIn Fam!
Get a FREE Cheat Sheet on System Design

Loved Our YouTube Videos? Get a FREE Cheat Sheet on System Design.