LinkedIn Interview Questions
- DSA
- LLD
- HLD
Q1: Connection Distance Problem
This problem represents users as nodes and their friendships as edges. The goal is to find how many “hops” or steps are required for one user to reach another through connections. Since all edges have equal weight, Breadth-First Search (BFS) is the best algorithm. You start from the source user, explore all immediate connections, then their connections, and so on, until the target is found. BFS ensures the shortest path in unweighted graphs.
Example:
Bob → Alice → Frank
Distance = 2
Bob → John → Jenny → Frank is a longer route; BFS finds the shorter one first.
Shortest path = 2.
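A minimal BFS sketch in Python (the graph shape and names are illustrative):

```python
from collections import deque

def connection_distance(graph, src, dst):
    # graph: dict mapping each user to a list of direct connections
    if src == dst:
        return 0
    visited = {src}
    queue = deque([(src, 0)])
    while queue:
        user, hops = queue.popleft()
        for friend in graph.get(user, []):
            if friend == dst:
                return hops + 1
            if friend not in visited:
                visited.add(friend)
                queue.append((friend, hops + 1))
    return -1  # not reachable

graph = {"Bob": ["Alice", "John"], "Alice": ["Bob", "Frank"],
         "John": ["Bob"], "Frank": ["Alice"]}
print(connection_distance(graph, "Bob", "Frank"))  # 2
```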
Q2: Number of Islands
This problem counts distinct connected groups of land in a 2D grid where 1 represents land and 0 represents water. You can treat each cell as a graph node and adjacent land cells (up, down, left, right) as edges. Using Depth-First Search (DFS) or Breadth-First Search (BFS), you “sink” or mark visited land to avoid recounting. Every time you find an unvisited land cell, it represents a new island.
Example:
Grid:
1 1 0
0 1 0
1 0 1
Islands = 3.
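A Python sketch of the "sinking" approach (it mutates the grid in place):

```python
def num_islands(grid):
    rows, cols = len(grid), len(grid[0])

    def sink(r, c):
        # Stop at the boundary or at water.
        if r < 0 or r >= rows or c < 0 or c >= cols or grid[r][c] != 1:
            return
        grid[r][c] = 0  # mark visited by "sinking" the land
        sink(r + 1, c); sink(r - 1, c); sink(r, c + 1); sink(r, c - 1)

    count = 0
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] == 1:
                count += 1   # every unvisited land cell starts a new island
                sink(r, c)
    return count

print(num_islands([[1, 1, 0], [0, 1, 0], [1, 0, 1]]))  # 3
```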
Q3: Check if a Graph is Bipartite
A graph is bipartite if you can color it using two colors such that no adjacent nodes share the same color. BFS or DFS is used to assign alternate colors to neighbors. If at any point a neighbor already has the same color, the graph is not bipartite. This also works for disconnected graphs by checking each component.
Example:
Graph: 0–1–2–3–0 → Even cycle → Bipartite
Graph: 0–1–2–0 → Odd cycle → Not bipartite
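A BFS two-coloring sketch in Python that also covers disconnected graphs:

```python
from collections import deque

def is_bipartite(adj):
    # adj: list of neighbor lists; color[i] is 0/1, or -1 if unvisited
    color = [-1] * len(adj)
    for start in range(len(adj)):          # handles disconnected graphs
        if color[start] != -1:
            continue
        color[start] = 0
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nxt in adj[node]:
                if color[nxt] == -1:
                    color[nxt] = 1 - color[node]
                    queue.append(nxt)
                elif color[nxt] == color[node]:
                    return False           # a neighbor shares the color
    return True

print(is_bipartite([[1, 3], [0, 2], [1, 3], [0, 2]]))  # even cycle -> True
print(is_bipartite([[1, 2], [0, 2], [0, 1]]))          # odd cycle -> False
```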
Q4: Course Schedule II
You are given a list of courses and prerequisites. The problem is to determine a valid order in which all courses can be completed. The graph is directed, where an edge A → B means A depends on B. Perform topological sorting using Kahn’s algorithm (BFS) or DFS. Start with courses that have no prerequisites, remove them, and reduce dependency counts of other courses.
Example:
Input: 2 courses, prerequisites: [[1,0]]
Order: 0 → 1
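A sketch of Kahn's algorithm in Python (it returns an empty list when a cycle makes ordering impossible):

```python
from collections import deque

def find_order(num_courses, prerequisites):
    # prerequisites: [a, b] means course a depends on course b
    adj = [[] for _ in range(num_courses)]
    indegree = [0] * num_courses
    for a, b in prerequisites:
        adj[b].append(a)      # finish b before a
        indegree[a] += 1
    queue = deque(i for i in range(num_courses) if indegree[i] == 0)
    order = []
    while queue:
        course = queue.popleft()
        order.append(course)
        for nxt in adj[course]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                queue.append(nxt)
    return order if len(order) == num_courses else []  # [] if a cycle exists

print(find_order(2, [[1, 0]]))  # [0, 1]
```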
Q5: Graph Shortest Path + Print Path
To find the shortest path between two nodes in an unweighted graph, BFS is ideal. As you explore neighbors, maintain a parent map for each visited node. When you reach the destination, reconstruct the path by walking backward through parents.
Example:
Edges: A→B, B→C, C→D
Path: A → B → C → D
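A Python sketch that records parents during BFS and reconstructs the path afterward:

```python
from collections import deque

def shortest_path(graph, src, dst):
    parent = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            break
        for nxt in graph.get(node, []):
            if nxt not in parent:       # doubles as the visited set
                parent[nxt] = node
                queue.append(nxt)
    if dst not in parent:
        return None
    path = []
    while dst is not None:              # walk backward through parents
        path.append(dst)
        dst = parent[dst]
    return path[::-1]

print(shortest_path({"A": ["B"], "B": ["C"], "C": ["D"]}, "A", "D"))
# ['A', 'B', 'C', 'D']
```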
Q6: Number of Connected Components
This problem counts how many separate groups of connected nodes exist. You can use DFS or BFS to explore one group at a time. Each time you find an unvisited node, it forms a new component, and you explore all nodes connected to it.
Example:
Graph:
1–2–3 and 4–5
Connected components = 2
Q7: Alien Dictionary
Given a sorted dictionary of words from an alien language, determine the order of characters. This is a topological sorting problem. Compare adjacent words and find the first differing character to establish ordering between characters. Build a directed graph and perform topological sort.
Example:
Words: [“baa”,”abcd”,”abca”,”cab”,”cad”]
Order may be: b d a c
Q8: Graph Valid Tree
A valid tree must have:
- Exactly n-1 edges
- Full connectivity (only one component)
Use DFS/BFS to check connectivity and confirm no cycles exist.
Example:
Edges: [[0,1],[0,2],[0,3]] → Valid tree
Edges: [[0,1],[1,2],[2,0]] → Has cycle → Not a tree
Q9: Trapping Rain Water
Given heights of bars, compute how much water can be trapped. Use two-pointer technique by maintaining left-max and right-max values. Water stored at each index is min(leftMax, rightMax) – height[i].
Example:
Input: [0,1,0,2,1,0,1,3]
Output: 5 units
Q10: Sliding Window Maximum
You must find the maximum in every sliding window of size k. Use a deque that stores indexes in decreasing order of values. The front of the deque always represents the maximum for the current window.
Example:
nums = [1,3,-1,-3,5,3,6,7], k=3
Output: [3,3,5,5,6,7]
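A monotonic-deque sketch in Python:

```python
from collections import deque

def max_sliding_window(nums, k):
    dq = deque()   # holds indexes; their values stay in decreasing order
    result = []
    for i, x in enumerate(nums):
        while dq and nums[dq[-1]] <= x:   # drop smaller values from the back
            dq.pop()
        dq.append(i)
        if dq[0] <= i - k:                # front index slid out of the window
            dq.popleft()
        if i >= k - 1:
            result.append(nums[dq[0]])    # front is the window maximum
    return result

print(max_sliding_window([1, 3, -1, -3, 5, 3, 6, 7], 3))  # [3, 3, 5, 5, 6, 7]
```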
Q11: Rotate a Linked List by N Nodes
Rotating a linked list means shifting the last N nodes to the front while maintaining their original order. To solve this efficiently, first compute the list length, then connect the tail to the head to form a circular linked list. Next, determine the new breaking point, which lies at position (length – N % length). Traverse to that node, break the circle, and update the head. This ensures the rotation happens with only one pass after calculating length.
Example:
List: 1 → 2 → 3 → 4 → 5, rotate by 2
Result: 4 → 5 → 1 → 2 → 3
Q12: Merge Two Sorted Linked Lists
This problem requires merging two sorted linked lists into one sorted list. Use two pointers, each pointing to a list. Compare values and attach the smaller node to a new merged list. Move the pointer whose node was chosen. When one list becomes empty, attach the remainder of the other list. Time complexity is O(m+n).
Example:
List1: 1 → 3 → 5
List2: 2 → 4 → 6
Merged: 1 → 2 → 3 → 4 → 5 → 6
Q13: Linked List Cycle Detection
Detecting a cycle efficiently uses Floyd’s Tortoise and Hare algorithm. Two pointers traverse the list—slow moves one step, fast moves two. If there’s a cycle, both pointers eventually meet inside it. If fast reaches null, the list has no cycle. This approach uses O(1) space and O(n) time.
Example:
1 → 2 → 3 → 4 → 2 (cycle back to node 2)
Pointers eventually meet inside the cycle → cycle exists.
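A Python sketch of Floyd's algorithm (the ListNode class is a minimal illustration):

```python
class ListNode:
    def __init__(self, val):
        self.val = val
        self.next = None

def has_cycle(head):
    slow = fast = head
    while fast and fast.next:
        slow = slow.next          # one step
        fast = fast.next.next     # two steps
        if slow is fast:
            return True           # pointers met inside the cycle
    return False                  # fast hit the end: no cycle

# 1 -> 2 -> 3 -> 4 -> (back to 2)
a, b, c, d = ListNode(1), ListNode(2), ListNode(3), ListNode(4)
a.next, b.next, c.next, d.next = b, c, d, b
print(has_cycle(a))  # True
```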
Q14: Reverse Linked List
To reverse a linked list, iterate through the nodes and reverse the direction of the links one by one. Use three pointers: prev, curr, next. At each step, point curr.next to prev, then advance all pointers. Continue until all nodes are reversed.
Example:
Input: 1 → 2 → 3
Output: 3 → 2 → 1
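A Python sketch, reusing the ListNode class from the cycle-detection example above:

```python
def reverse_list(head):
    prev = None
    curr = head
    while curr:
        nxt = curr.next    # save the next node
        curr.next = prev   # reverse the link
        prev = curr        # advance prev
        curr = nxt         # advance curr
    return prev            # prev is the new head
```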
Q15: Palindrome Linked List
Check whether a linked list reads the same forward and backward. Use two-pointer technique to find the middle, reverse the second half, and then compare both halves node by node. After comparison, you can optionally restore the list.
Example:
1 → 2 → 3 → 2 → 1 → palindrome → true
1 → 2 → 3 → 4 → false
Q16: Intersection of Two Linked Lists
Two linked lists may intersect at some node. Use two pointers: traverse each list, and when a pointer reaches the end, redirect it to the other list’s head. If lists intersect, pointers meet at the intersection. Otherwise they meet at null.
Example:
A: 1 → 2 → 3 → 8 → 9
B: 4 → 5 → 8 → 9
Intersection at node 8
Q17: Remove Nth Node From End
Use two pointers: move fast N steps ahead, then move both slow and fast together until fast reaches the end. Now slow is at the node just before the one to remove. Adjust pointers to skip the target.
Example:
List: 1,2,3,4,5, n=2
Remove 4 → Result: 1,2,3,5
Q18: Reorder List
Reordering rearranges list nodes in pattern: first → last → second → second last…
Steps:
- Find middle
- Reverse second half
- Merge both halves alternately
Example:
1 → 2 → 3 → 4 → 5
Reordered: 1 → 5 → 2 → 4 → 3
Q19: Kadane’s Algorithm (Maximum Subarray Sum)
Kadane’s algorithm finds the contiguous subarray with the largest sum in O(n) time. It maintains a running sum that resets to zero whenever it becomes negative, because a negative prefix would only reduce future results. A global variable tracks the highest sum seen so far. This method works efficiently for arrays with both positive and negative numbers.
Example:
Input: [-2,1,-3,4,-1,2,1,-5,4]
Best subarray: [4, -1, 2, 1]
Maximum sum = 6
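A Python sketch of Kadane's algorithm:

```python
def max_subarray(nums):
    best = float("-inf")
    running = 0
    for x in nums:
        running += x
        best = max(best, running)
        if running < 0:
            running = 0   # a negative prefix only hurts future sums
    return best

print(max_subarray([-2, 1, -3, 4, -1, 2, 1, -5, 4]))  # 6
```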
Q20: Can Place Flowers
You’re given a flowerbed array containing 0s (empty) and 1s (planted). You must determine whether n new flowers can be planted without violating the rule that flowers cannot be adjacent. Loop through the array and check if the current spot and its neighbors are empty. If so, plant a flower (mark 1) and reduce n.
Example:
Input: [1,0,0,0,1], n = 1
We can plant at index 2 → return true
Q21: Max Consecutive Ones III
You want the longest contiguous subarray of 1s, but you’re allowed to flip up to k zeros into ones. Using the sliding window technique, expand the window to the right while counting zeros. If zeros exceed k, shrink the window from the left until zeros become valid again. Track the largest window seen.
Example:
Input: [1,1,0,0,1,1,1], k = 1
Max consecutive = 4 (flip the zero at index 3)
Q22: Maximum Length of Consecutive 1s With Flips
This generalizes the previous problem: you may flip at most a given number of zeros. Using two pointers, keep expanding right while tracking the zero count. When the zero count exceeds the allowed flips, move the left pointer until the window becomes valid again. The sliding window gives O(n) time.
Example:
Input: [1,0,1,1,0,1], flips = 1
Max length = 4
Q23: Two-Pointer Medium Problem
Two-pointer problems involve moving one pointer from the start and another from the end (or sliding both inside). These are used in pair-sum problems, sorting-based problems, or optimizing subarrays.
Example:
Find a pair sum = 10 in [1,3,4,6,7]:
3 + 7 = 10 (found)
Q24: Longest Substring Without Repeating Characters
Use sliding window + hashmap to track characters. Expand right pointer while characters are unique. Once a duplicate is found, move left pointer until window becomes valid again. Track max window size.
Example:
Input: “abcabcbb”
Longest substring = “abc” → 3
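A sliding-window sketch in Python that tracks each character's last seen index:

```python
def length_of_longest_substring(s):
    last_seen = {}   # char -> most recent index
    left = 0
    best = 0
    for right, ch in enumerate(s):
        if ch in last_seen and last_seen[ch] >= left:
            left = last_seen[ch] + 1   # jump past the duplicate
        last_seen[ch] = right
        best = max(best, right - left + 1)
    return best

print(length_of_longest_substring("abcabcbb"))  # 3
```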
Q25: Product of Array Except Self
The goal is to compute an array where each element equals the product of all other array elements, without using division. Use prefix multiplication (left products) and suffix multiplication (right products). Final answer for each index = left[i] * right[i].
Example:
Input: [1,2,3,4]
Output: [24,12,8,6]
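A two-pass Python sketch that reuses the output array for the left products, so extra space stays O(1):

```python
def product_except_self(nums):
    n = len(nums)
    result = [1] * n
    left = 1
    for i in range(n):              # result[i] = product of everything left of i
        result[i] = left
        left *= nums[i]
    right = 1
    for i in range(n - 1, -1, -1):  # multiply in the product right of i
        result[i] *= right
        right *= nums[i]
    return result

print(product_except_self([1, 2, 3, 4]))  # [24, 12, 8, 6]
```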
Q26: Container With Most Water
Given heights of vertical lines, find two lines that trap the maximum water. Use two pointers: left at start, right at end. Calculate area = min(height[left], height[right]) × width. Move inward the pointer with the smaller height.
Example:
Input: [1,8,6,2,5,4,8,3,7]
Max water = 49
Q27: 3Sum
Find all distinct triplets that sum to zero. Sort array, fix one element, and use two pointers to find complementary pairs. Skip duplicates to avoid repeating triplets.
Example:
Input: [-1,0,1,2,-1,-4]
Output: [[-1,-1,2], [-1,0,1]]
Q28: Find K Closest Elements
Given a sorted array, find K elements closest to X. Use binary search to find X’s position, then expand two pointers outward to collect closest values.
Example:
Input: [1,2,3,4,5], X = 3, K = 4
Output: [1,2,3,4]
Q29: Kth Largest Element in an Array
Use a min-heap of size K. Insert elements; if heap grows beyond K, remove smallest. The top of heap becomes the Kth largest element.
Example:
Input: [3,2,3,1,2,4,5,5,6], k=4
Output: 4
Q30: Top K Frequent Elements
Count frequencies using hashmap, then use a max-heap or bucket sort to extract the K most frequent numbers.
Example:
Input: [1,1,1,2,2,3], k=2
Output: [1,2]
Q31: Meeting Rooms II
Given meeting intervals, find the minimum number of rooms required. Sort by start time and use a min-heap to track ending times. If a meeting starts before the earliest ending meeting ends, allocate new room.
Example:
[[0,30],[5,10],[15,20]] → rooms = 2
Q32: Spiral Matrix (Expanded)
The Spiral Matrix problem requires traversing a matrix in circular layers. You start at the top-left corner and move right across the top row, then down the rightmost column, left across the bottom, and finally up the leftmost column. After completing one layer, shrink the boundaries (top++, bottom--, left++, right--) and repeat until all elements are printed. The challenge lies in maintaining correct boundaries and avoiding duplicate printing. This pattern is useful in image processing, simulations, and matrix visualization.
Example:
Matrix:
1 2 3
4 5 6
7 8 9
Output: [1,2,3,6,9,8,7,4,5]
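A boundary-shrinking sketch in Python:

```python
def spiral_order(matrix):
    result = []
    top, bottom = 0, len(matrix) - 1
    left, right = 0, len(matrix[0]) - 1
    while top <= bottom and left <= right:
        for c in range(left, right + 1):          # top row, left -> right
            result.append(matrix[top][c])
        top += 1
        for r in range(top, bottom + 1):          # right column, top -> bottom
            result.append(matrix[r][right])
        right -= 1
        if top <= bottom:
            for c in range(right, left - 1, -1):  # bottom row, right -> left
                result.append(matrix[bottom][c])
            bottom -= 1
        if left <= right:
            for r in range(bottom, top - 1, -1):  # left column, bottom -> top
                result.append(matrix[r][left])
            left += 1
    return result

print(spiral_order([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))  # [1,2,3,6,9,8,7,4,5]
```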
Q33: Jump Game
Jump Game asks whether you can reach the final index of an array, where each element represents the maximum jump length from that position. A greedy approach is most efficient: track the farthest index you can reach (maxReach). Iterate through the array—if your current index is ever greater than maxReach, you’re stuck and must return false. Otherwise, update maxReach = max(maxReach, i + nums[i]). If maxReach eventually reaches or passes the final index, the answer is true.
Example:
[2,3,1,1,4] → You can hop all the way → true
[3,2,1,0,4] → Stuck at index 3 → false
Q34: Unique Paths
You are given an m×n grid and must find how many distinct ways a robot can travel from the top-left corner to the bottom-right corner, moving only down or right. The dynamic programming solution builds a table where each cell stores the total number of ways to reach it:
Ways[i][j] = Ways[i–1][j] + Ways[i][j–1].
The first row and first column are initialized to 1 because there’s only one way to reach those cells. This classic combinatorial problem also has a direct formula using combinations: C(m+n-2, m-1).
Example:
Grid: 3×7
Total unique paths = 28
Q35: Climbing Stairs
This problem asks: “How many ways can you climb n stairs if you can take either 1 or 2 steps at a time?” Each step can come from either (n−1) or (n−2), making the solution identical to the Fibonacci sequence.
Ways(n) = Ways(n−1) + Ways(n−2)
Base cases: Ways(1)=1, Ways(2)=2.
You can use either dynamic programming or simple iterative Fibonacci computation for O(n) time and O(1) space. This problem teaches recurrence relations, DP transitions, and mathematical patterns.
Example:
n = 5
Ways = 8 (1-1-1-1-1, 1-1-1-2, 1-2-2, 2-1-2, etc.)
Q36: Minimum Path Sum
You are given a grid with positive numbers. You must reach the bottom-right cell by moving only right or down, minimizing the total sum along the path. Use dynamic programming:
dp[i][j] = grid[i][j] + min(dp[i–1][j], dp[i][j–1])
Initialize the first row and first column by accumulating sums since there’s only one path to them. This problem reinforces grid DP fundamentals.
Example:
Grid:
1 3 1
1 5 1
4 2 1
Optimal path: 1 → 3 → 1 → 1 → 1
Minimum sum = 7
Q37: Decode Ways
You decode a digit string where ‘1’ → ‘A’, ‘2’ → ‘B’, …, ’26’ → ‘Z’. The challenge arises from handling zeroes and deciding which digits can form valid letters. Use dynamic programming:
dp[i] = number of ways to decode substring ending at i.
If s[i] is valid (1–9), add dp[i−1].
If s[i−1..i] is valid (10–26), add dp[i−2].
Zeroes require careful handling—‘0’ cannot stand alone.
Example:
“226” → valid decodings: “BZ”, “VF”, “BBF”
Output = 3
Q38: Search in Rotated Sorted Array
This problem requires locating a target number in a rotated sorted array—an array that was originally sorted but rotated at some pivot. The key insight is that at least one half of the array is always sorted. Using modified binary search, check whether the left half or right half is sorted. Then determine if the target lies within that sorted half. If yes, narrow search to that half; otherwise, search the opposite half. Repeat until the element is found or the range collapses. This ensures O(log n) time.
Example:
Array: [4,5,6,7,0,1,2], Target = 0
Output index → 4
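A modified-binary-search sketch in Python:

```python
def search_rotated(nums, target):
    lo, hi = 0, len(nums) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if nums[mid] == target:
            return mid
        if nums[lo] <= nums[mid]:              # left half is sorted
            if nums[lo] <= target < nums[mid]:
                hi = mid - 1                   # target is inside the sorted half
            else:
                lo = mid + 1
        else:                                  # right half is sorted
            if nums[mid] < target <= nums[hi]:
                lo = mid + 1
            else:
                hi = mid - 1
    return -1

print(search_rotated([4, 5, 6, 7, 0, 1, 2], 0))  # 4
```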
Q39: Validate Binary Search Tree
A valid BST must satisfy:
- Left subtree values < node value
- Right subtree values > node value
- Both subtrees must also be valid BSTs
Use DFS with min and max bounds to ensure each node fits valid range. Alternatively, do an inorder traversal; a BST’s inorder output must be strictly increasing. If any value violates the ordering, the tree is invalid. Common interview mistake: only checking immediate children—this fails deeper violations.
Example:
- Valid BST: root 2 with children 1 and 3
- Invalid BST: root 5 with children 1 and 4, where 4 has children 3 and 6 (3 sits in the right subtree of 5, but 3 < 5)
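A Python sketch of the min/max-bounds DFS (the TreeNode class is a minimal illustration, reused by the later tree sketches):

```python
class TreeNode:
    def __init__(self, val, left=None, right=None):
        self.val, self.left, self.right = val, left, right

def is_valid_bst(node, low=float("-inf"), high=float("inf")):
    # Every node must fit inside the (low, high) range inherited from
    # all its ancestors, not just beat its immediate parent.
    if node is None:
        return True
    if not (low < node.val < high):
        return False
    return (is_valid_bst(node.left, low, node.val) and
            is_valid_bst(node.right, node.val, high))

# Invalid example: node 3 falls in the right subtree of 5 but 3 < 5.
root = TreeNode(5, TreeNode(1), TreeNode(4, TreeNode(3), TreeNode(6)))
print(is_valid_bst(root))  # False
```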
Q40: Binary Tree Level Order Traversal
This BFS-based problem prints nodes level by level. Use a queue: push the root, then repeatedly pop nodes and push their children. Group nodes by level by measuring the queue’s size at each iteration. This traversal is commonly used in shortest path tree problems, hierarchy processing, serialization, and understanding tree structure.
Example:
Tree:
1
2 3
4 5 6
Output: [[1],[2,3],[4,5,6]]
Q41: Serialize & Deserialize Binary Tree
Serialization converts a tree into a string so it can be stored or transferred. Deserialization reconstructs the tree. The common method is preorder traversal with null markers (“null” or “#”) to preserve structure.
Example serialization: 1,2,null,null,3,4,null,null,5,null,null
For deserialization, read the string token by token—construct nodes recursively, creating actual tree nodes or nulls as needed.
This problem tests understanding of tree traversal, recursion, and pointer reconstruction.
Example:
Input tree: 1,2,3
Serialized → “1,2,null,null,3,null,null”
Deserializing returns the same tree.
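A preorder sketch in Python with null markers, reusing the TreeNode class from the Validate BST sketch:

```python
def serialize(node):
    if node is None:
        return "null"
    return f"{node.val},{serialize(node.left)},{serialize(node.right)}"

def deserialize(data):
    tokens = iter(data.split(","))

    def build():
        tok = next(tokens)
        if tok == "null":
            return None
        node = TreeNode(int(tok))
        node.left = build()    # preorder: rebuild the left subtree first
        node.right = build()
        return node

    return build()

root = TreeNode(1, TreeNode(2), TreeNode(3))
s = serialize(root)
print(s)                               # 1,2,null,null,3,null,null
print(serialize(deserialize(s)) == s)  # True: round-trip preserved
```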
Q42: Lowest Common Ancestor (LCA) of Binary Tree
LCA is the deepest node from which both target nodes are descendants. Using DFS, recursively search in left and right subtrees.
- If both sides return non-null, the current node is the LCA.
- If only one side returns non-null, bubble that up.
- If both return null, propagate null.
This approach works for general binary trees—not only BSTs.
Example:
Tree:
      3
    /   \
   5     1
  / \   / \
 6   2 0   8
LCA(5,1) = 3
LCA(6,2) = 5
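A DFS sketch in Python (p and q are node references in the tree):

```python
def lowest_common_ancestor(root, p, q):
    if root is None or root is p or root is q:
        return root
    left = lowest_common_ancestor(root.left, p, q)
    right = lowest_common_ancestor(root.right, p, q)
    if left and right:
        return root       # p and q sit in different subtrees
    return left or right  # bubble up whichever side found something
```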
Q43: Find Leaves of Binary Tree
The idea is to repeatedly remove leaves from the tree until the tree is empty. Instead of modifying the tree, compute node heights using DFS. A leaf has height 0. A parent has height = 1 + max(left height, right height). Group nodes by their height. Nodes with same height form one “round” of leaf removals.
Example:
Tree:
1
2 3
Heights: 2 → 0, 3 → 0, 1 → 1
Output: [[2,3],[1]]
Q44: K Points Near Origin
Given points (x, y), compute the distance to the origin using the squared distance x² + y². Use a max heap of size k:
- Push each point
- If the heap grows beyond k, pop the farthest point
This ensures only the k closest points remain.
Complexity: O(n log k).
Example:
Input: Points = [(3,3),(5,-1),(-2,4)], k = 2
Distances: 18, 26, 20
Closest → [(3,3), (-2,4)]
Q45: Binary Search Tree Operations
A BST supports:
- Insert
- Search
- Delete
- Traversals
Insertion follows BST rules—left < node < right. Searching is O(log n) if balanced, but O(n) in the worst case. Deletion has three cases:
- Node is a leaf → remove directly
- One child → replace the node with its child
- Two children → replace with the inorder successor (smallest in the right subtree)
An inorder traversal of a BST yields sorted output.
Example:
Insert: 5,3,7,4
Inorder → [3,4,5,7]
Q46: Maximum Depth of Binary Tree
Maximum depth is the number of levels from root to deepest leaf. Using DFS, compute:
depth(node) = 1 + max(depth(left), depth(right))
Depth is essential for evaluating tree complexity, recursion limits, and balancing.
Example:
Tree:
1
2 3
4
Max depth = 3
Q47: Build Tree from Preorder & Inorder Traversal
Given preorder (root-left-right) and inorder (left-root-right) sequences, reconstruct the binary tree. The first preorder element is the root. Locate that in inorder to determine left and right subtree sizes. Recursively repeat for both sides.
This tests recursion, slicing, and understanding of traversal properties.
Example:
Preorder: [3,9,20,15,7]
Inorder: [9,3,15,20,7]
Reconstructed tree root = 3
Q48: Valid Parentheses
This problem checks whether a string containing parentheses (), [], {} is properly balanced. A stack is the ideal choice because it follows the LIFO principle. For every opening bracket, push it onto the stack. For every closing bracket, check if the stack top contains the matching opening bracket—if not, the string is invalid. At the end, if the stack is empty and no mismatches were found, the string is valid. This problem is commonly used to test stack fundamentals, parsing logic, and error-checking mechanisms.
Example:
Input: “({[]})” → Valid
Input: “([)]” → Invalid
Q49: Exclusive Time of Functions
Given logs of function start and end times, compute each function’s exclusive execution time—excluding time spent in other functions called inside it. The solution uses a stack to track which function is currently running. When a function starts, push it with a timestamp. When it ends, pop it and compute the difference, subtracting time spent in child calls. Add this exclusive time to its ID. The problem teaches nested execution handling, timestamp interpretation, and stack simulation.
Example:
Logs:
0:start:0, 1:start:2, 1:end:5, 0:end:6
Exclusive: func0 = 3, func1 = 4
Q50: Nested Stack with Inclusive–Exclusive Time
This medium-hard problem involves processing nested segments with timestamps, where some are inclusive and others exclusive. A stack maintains currently active segments, and each time a segment starts, the previous one is paused. When a segment ends, its duration is computed and subtracted from the parent segment’s timeline. This models cases like nested logging, profiling, or task scheduling. Handling inclusive vs exclusive ranges requires careful comparison of start and end boundaries.
Example:
A starts at 0, B inside A starts at 2 → end at 5
A runs from 0 to 2 + from 5 to end
B gets full duration (3 units)
Q51: LRU Cache
LRU Cache evicts the least recently accessed item when capacity is full. To support O(1) get and put operations, use a doubly linked list + a hashmap. The doubly linked list tracks usage order (most recent at front, least recent at end). On every access, move the node to the front. When inserting a new key while full, remove the tail node (least recently used). This combines fast lookup with predictable eviction patterns.
Example:
Cache capacity = 2
Put(1,1), Put(2,2), Get(1) → reorder
Put(3,3) → evicts 2
Cache = {1,3}
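A compact Python sketch; collections.OrderedDict stands in for the hashmap plus doubly linked list described above:

```python
from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()

    def get(self, key):
        if key not in self.data:
            return -1
        self.data.move_to_end(key)         # mark as most recently used
        return self.data[key]

    def put(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        self.data[key] = value
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)  # evict the least recently used

cache = LRUCache(2)
cache.put(1, 1); cache.put(2, 2)
cache.get(1)                     # touches key 1
cache.put(3, 3)                  # evicts key 2
print(list(cache.data.keys()))   # [1, 3]
```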
Q52: LFU Cache
LFU evicts the item with the lowest access frequency. If two items share the same frequency, evict the least recently used among them. Use a combination of:
- Hashmap for key → (value, frequency)
- Hashmap of frequency buckets
- Doubly linked lists inside each frequency bucket
When accessing a key, increase its frequency and move it to the correct bucket. LFU is harder than LRU because two conditions (frequency + recency) must be satisfied.
Example:
Capacity 2
Put(1), Put(2)
Get(1) makes frequency: 1→2, 2→1
Put(3) → evicts key 2 (lowest freq)
Q53: Implement Trie
A trie stores strings in a prefix-based tree structure. Each character represents a node, and complete words have a terminal marker. Tries support very fast prefix queries by walking down child pointers. Useful for autocomplete, dictionary operations, spell-checking, and search engines.
Example:
Insert: “cat”, “car”
Search for “ca” → prefix exists
Search “cat” → word exists
Search “cap” → no
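A dictionary-based trie sketch in Python:

```python
class Trie:
    def __init__(self):
        self.children = {}
        self.is_word = False   # terminal marker for complete words

    def insert(self, word):
        node = self
        for ch in word:
            node = node.children.setdefault(ch, Trie())
        node.is_word = True

    def _walk(self, prefix):
        node = self
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return None
        return node

    def search(self, word):
        node = self._walk(word)
        return node is not None and node.is_word

    def starts_with(self, prefix):
        return self._walk(prefix) is not None

t = Trie()
t.insert("cat"); t.insert("car")
print(t.starts_with("ca"), t.search("cat"), t.search("cap"))  # True True False
```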
Q54: Add & Search Word Structure
This extends a trie to support wildcard searching. The dot . matches any character. During search, whenever a dot appears, recursively explore all children nodes at that level. This requires DFS in the trie.
Example:
Insert: “bad”, “dad”, “mad”
Search “.ad” → true
Search “b..” → true
Search “..b” → false
Q55: Word Break
Given a string and a dictionary of words, determine if the string can be segmented into valid words. Use dynamic programming:
dp[i] = true if substring ending at i can be formed using dictionary words.
Try all possible breakpoints before i. If any valid split exists, mark dp[i] as true.
This problem teaches substring DP, dictionary lookups, and segmentation logic.
Example:
String: “leetcode”
Dict: {“leet”, “code”}
Result → true
Q56: Combination Sum
Combination Sum requires finding all unique combinations of numbers that add up to a target value. You can reuse the same number multiple times. Use backtracking with a recursive function that explores each candidate starting from a chosen index. At each step:
- If target becomes 0 → push current combination
- If target becomes negative → stop exploring
- Otherwise, continue exploring by reusing the same number
Sorting helps maintain consistency and avoid duplicates. This problem teaches tree exploration, pruning strategies, and constructing solution paths.
Example:
Candidates: [2,3,6,7], Target = 7
Output: [[2,2,3],[7]]
Q57: Permutations
To generate all permutations of a list, use backtracking or swapping-based recursion. Each recursion level selects one element to place at the current index while the rest remain available. You continue until the current arrangement reaches full length. Ensure every element is used exactly once per permutation. Time complexity is O(n×n!) because all possible arrangements are produced. This problem tests recursion depth, visited sets, and arrangement logic.
Example:
Input: [1,2,3]
Output includes:
[1,2,3], [1,3,2], [2,1,3], [3,2,1]
Q58: Subsets
This problem asks for all possible subsets, including the empty set and the full set. Use backtracking to either include or exclude each element. Alternatively, use bit manipulation from 0 to 2ⁿ−1, interpreting each bit as “include” or “exclude.” Backtracking gives clearer structure and is more intuitive. Subsets problems appear frequently in combinatorics, dynamic programming, and interview recursion questions.
Example:
Input: [1,2]
Output: [[], [1], [2], [1,2]]
Q59: N-Queens
The N-Queens problem asks you to place N queens on an N×N chessboard so that none attack each other (no shared row, column, or diagonal). Backtracking is used: place one queen per row and check column and diagonal conflicts. Maintain three sets (columns, diag1, diag2) to track unsafe positions. Each valid placement leads to deeper recursive calls. When a solution is found, capture the board configuration. N-Queens is a classic test of recursion, pruning, and constraint satisfaction.
Example:
N=4 solutions include:
.Q..
...Q
Q...
..Q.
Q60: Palindrome Partitioning
Partition a string such that every substring is a palindrome. Use backtracking: at each index, expand possible substrings and check if they are palindromes. If yes, recursively explore partitions from the next index. Efficient palindrome checking and memoization help reduce repeated computation. This problem is important for recursion, string manipulation, and dynamic partitioning.
Example:
Input: “aab”
Output: [[“a”,”a”,”b”],[“aa”,”b”]]
Q61: Generate Parentheses
You must generate all valid combinations of n pairs of parentheses. Use backtracking while tracking:
- How many ‘(‘ have been added
- How many ‘)’ have been added
You can add ‘(’ only while the open count is less than n, and ‘)’ only while the close count is less than the open count. This ensures balanced parentheses throughout construction. This problem teaches constraint-based recursion and well-formed structure generation.
Example:
Input: n = 3
Output:
[“((()))”,”(()())”,”(())()”,”()(())”,”()()()”]
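A backtracking sketch in Python:

```python
def generate_parenthesis(n):
    result = []

    def backtrack(current, open_count, close_count):
        if len(current) == 2 * n:
            result.append(current)
            return
        if open_count < n:                 # room for another '('
            backtrack(current + "(", open_count + 1, close_count)
        if close_count < open_count:       # ')' must never outnumber '('
            backtrack(current + ")", open_count, close_count + 1)

    backtrack("", 0, 0)
    return result

print(generate_parenthesis(3))
# ['((()))', '(()())', '(())()', '()(())', '()()()']
```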
Q62: Longest Increasing Subsequence
LIS finds the longest subsequence that strictly increases. Use two approaches:
- DP (O(n²)) — dp[i] = longest subsequence ending at i
- Binary search method (O(n log n)) using a “tails” array
The tails array stores the smallest possible ending values for subsequences of different lengths. Replace elements using binary search to maintain optimal growth. Understanding LIS is vital for advanced DP, patience sorting, and optimization problems.
Example:
Input: [10,9,2,5,3,7,101,18]
LIS = 4 → subsequence [2,3,7,18]
Q63: Coin Change
Coin Change asks for minimum coins needed to reach a given amount. Use dynamic programming where dp[x] represents the minimum coins to achieve value x. Initialize dp with infinity except dp[0]=0. For each coin, update dp for all reachable amounts. If dp[amount] remains infinity, no solution exists. This problem is foundational in DP, knapsack variants, and optimization.
Example:
Coins: [1,2,5], Amount = 11
Result = 3 (5 + 5 + 1)
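A bottom-up DP sketch in Python:

```python
def coin_change(coins, amount):
    INF = float("inf")
    dp = [0] + [INF] * amount       # dp[x] = min coins needed to make x
    for x in range(1, amount + 1):
        for coin in coins:
            if coin <= x and dp[x - coin] + 1 < dp[x]:
                dp[x] = dp[x - coin] + 1
    return dp[amount] if dp[amount] != INF else -1  # -1: amount unreachable

print(coin_change([1, 2, 5], 11))  # 3 (5 + 5 + 1)
```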
Q64: House Robber
You cannot rob adjacent houses; maximize profit. DP solution:
dp[i] = max(dp[i-1], nums[i] + dp[i-2])
Thus, for each house, choose to rob it (and skip previous) or skip it (take best so far).
Example:
Input: [2,7,9,3,1]
Output: 12 (rob houses with 2,9,1)
Q65: Longest Common Subsequence
LCS finds the longest subsequence present in both strings. Use DP:
If chars match: dp[i][j] = dp[i-1][j-1] + 1
Else: dp[i][j] = max(dp[i-1][j], dp[i][j-1])
LCS is crucial in diff tools, DNA sequence alignment, and version control.
Example:
Input: “abcde”, “ace”
Output = 3 (“ace”)
Q66: Kth Largest Element in an Array
This problem requires finding the Kth largest value in an unsorted array. The most efficient approach is using a min-heap of size K. Insert elements one by one; if the heap size exceeds K, remove the smallest element. This guarantees the heap always holds the K largest elements seen so far, with the smallest among them being the Kth largest. Time complexity is O(n log k), which is optimal for large inputs. Another approach is using Quickselect (a variant of Quicksort partitioning), achieving average O(n) time.
Example:
Input: [3,2,1,5,6,4], k=2
Output: 5 (the 2nd largest)
Q67: Top K Frequent Elements
This problem asks for the K elements with the highest frequency in an array. First count occurrences using a hashmap. Then use either a max heap (where each entry stores frequency and value) or bucket sort, where frequencies index buckets of numbers. Bucket sort runs in O(n) and is ideal when frequencies are bounded. This problem tests knowledge of hashing, heaps, and optimized sorting strategies.
Example:
Input: [1,1,1,2,2,3], k=2
Output: [1,2]
Q68: Merge Intervals
Given intervals, merge all overlapping ones. First sort by the start time. Then iterate through the list and compare each interval with the last merged interval. If they overlap, update the end boundary; if not, push the new interval to the result list.
This problem is heavily used in calendar merging, time-blocking systems, and event management platforms.
Example:
Input: [[1,3],[2,6],[8,10],[15,18]]
Merged: [[1,6],[8,10],[15,18]]
Q69: Insert Interval
This involves inserting a new interval into an already non-overlapping sorted list and merging where needed. Three stages occur:
- Add all intervals ending before the new interval
- Merge all overlapping intervals with the new one
- Append remaining intervals
This ensures correct placement and merging in O(n).
Example:
Input: intervals = [[1,3],[6,9]], new = [2,5]
Output = [[1,5],[6,9]]
Q70: Meeting Rooms
Meeting Rooms asks whether a person can attend all given meetings without overlap. You sort the intervals by start time and check if any meeting starts before the previous one ends. If so, overlapping occurs and attending all is not possible.
Example:
Input: [[0,30],[5,10],[15,20]] → Cannot
Input: [[7,10],[2,4]] → Possible
Q71: Meeting Rooms II
You must find the minimum number of rooms required to host all meetings. Sort meetings by start time and use a min-heap to track the earliest ending meeting. If the next meeting starts after the min end time, reuse the room; otherwise allocate a new one. This simulates real-world room scheduling systems.
Example:
Input: [[0,30],[5,10],[15,20]]
Output → 2 rooms
Q72: Minimum Number of Arrows to Burst Balloons
Each balloon is an interval representing where it can be burst. You want the minimum number of arrows such that each arrow shot through a coordinate destroys all overlapping balloons. Sort intervals by end time and greedily shoot arrows at the end of each interval, covering all balloons overlapping that point.
Example:
Input: [[10,16],[2,8],[1,6],[7,12]]
Output: 2 arrows
Q73: Sliding Window Maximum
This problem requires finding the maximum value within every sliding window of size k across an array. Using a naive approach of scanning each window takes O(nk), which is too slow. The optimal solution uses a deque that always keeps indexes of useful elements in decreasing order of value. The deque’s front always holds the maximum for the current window.
Example:
Input: [1,3,-1,-3,5,3,6,7], k=3
Output: [3,3,5,5,6,7]
Q74: Trapping Rain Water
You are given heights of bars and must find how much water can accumulate between them after rain. Using two pointers, track the highest bar from the left (leftMax) and right (rightMax). If the current left bar is lower, water trapped above it is leftMax – height[left]. If the right side is lower, use rightMax – height[right]. Move inward from the side with the lower bar. This classic problem teaches prefix max, suffix max, and two-pointer optimization.
Example:
Input: [0,1,0,2,1,0,1,3,2,1,2,1]
Output: 6
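A two-pointer sketch in Python:

```python
def trap(height):
    left, right = 0, len(height) - 1
    left_max = right_max = 0
    water = 0
    while left < right:
        if height[left] < height[right]:
            left_max = max(left_max, height[left])
            water += left_max - height[left]    # bounded by the left wall
            left += 1
        else:
            right_max = max(right_max, height[right])
            water += right_max - height[right]  # bounded by the right wall
            right -= 1
    return water

print(trap([0, 1, 0, 2, 1, 0, 1, 3, 2, 1, 2, 1]))  # 6
```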
Q75: Spiral Matrix II
Instead of reading a matrix spirally, you generate an n×n matrix filled with numbers from 1 to n² in spiral order. You maintain boundaries (top, bottom, left, right) and fill layer by layer: move right, down, left, and up, then shrink boundaries. This problem reinforces understanding of spiral traversal patterns but applied during construction instead of reading.
Example: n=3
Output: 1 2 3, 8 9 4, 7 6 5
Q76: Binary Tree Zigzag Level Order Traversal
This problem is a variation of level-order traversal where nodes are visited level by level, but the direction alternates between left-to-right and right-to-left. A queue performs the BFS while a boolean flag controls the direction. After each level is processed, reverse the list if the direction is right-to-left. This zigzag pattern is often used in visualizing hierarchical data where directions alternate for readability.
Example:
Tree:
   3
  / \
 9   20
    /  \
   15   7
Output:
[[3], [20,9], [15,7]]
Q77: Symmetric Tree
A binary tree is symmetric if its left subtree mirrors the right subtree. Use recursion by comparing corresponding nodes: left-left vs right-right and left-right vs right-left. Both values and structure must match. Alternatively, use a queue for iterative comparison.
Example:
Tree:
     1
   /   \
  2     2
 / \   / \
3   4 4   3
Output: true
Q78: Flatten Binary Tree to Linked List
Flatten the binary tree into a linked list following preorder traversal. You can do this in-place by using recursion to flatten left and right subtrees first. Then move the left subtree to the right, attach the original right subtree to the end of the moved left subtree, and nullify the left pointer. This problem tests pointer manipulation and recursion.
Example:
Tree: 1 → 2,3 → 4,5
Flattened list: 1 → 2 → 4 → 5 → 3
Q79: Path Sum
Given a binary tree and a target sum, determine if a root-to-leaf path exists such that the sum of node values equals the target. Use DFS by subtracting the node’s value from the sum and exploring children. A valid path must end at a leaf. This is commonly used in financial flow problems, path validations, and decision tree tracing.
Example:
Tree:
5
4 8
11 13 4
7 2 1
Target: 22
Path: 5→4→11→2 → true
Q80: Path Sum II
Unlike Path Sum I, this version requires finding all root-to-leaf paths that match the target sum. Use backtracking: maintain a current path list and total sum. When at a leaf, check if the sum matches; if yes, append the current path to results. On returning, remove the last node to backtrack.
Example:
Tree above, target 22
Output:
[[5,4,11,2]]
Q81: Path Sum III
This problem counts all possible paths (starting at any node, not just root) whose sum equals a given target. Use prefix-sum hashmap technique:
currentSum – target = previousSum
If previousSum appears in the map, a valid path exists. This allows checking paths in O(n).
Example:
Tree: [10,5,-3,3,2,null,11,…]
Target = 8
Output = 3 valid paths
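A prefix-sum-map sketch in Python, reusing the TreeNode class from the Validate BST sketch:

```python
from collections import defaultdict

def path_sum(root, target):
    prefix_counts = defaultdict(int)
    prefix_counts[0] = 1      # empty prefix lets root-anchored paths count

    def dfs(node, current_sum):
        if node is None:
            return 0
        current_sum += node.val
        # Paths ending here: prefixes where currentSum - target was seen.
        count = prefix_counts[current_sum - target]
        prefix_counts[current_sum] += 1
        count += dfs(node.left, current_sum) + dfs(node.right, current_sum)
        prefix_counts[current_sum] -= 1   # backtrack out of this branch
        return count

    return dfs(root, 0)
```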
Q82: Construct Binary Tree from Inorder & Postorder
The last value in postorder is the root. Find this in inorder to determine left and right subtree sizes. Recursively build right subtree first (because of postorder property), then left subtree. Managing indices correctly is key.
Example:
Inorder: [9,3,15,20,7]
Postorder: [9,15,7,20,3]
Root = 3
Q83: Populating Next Right Pointers in Each Node
Given a perfect binary tree, connect each node’s next pointer to its right neighbor. Use queue BFS or constant-space pointer traversal (connect left child to right child, and right child to next node’s left).
Example:
If level is 1→2→3
Output next pointers: 1→null, 2→3, 3→null
Q84: Balanced Binary Tree
A binary tree is height-balanced if for every node, the height difference between left and right subtrees is at most 1. Use DFS to compute height while simultaneously checking balance; if a subtree is unbalanced, propagate failure upward using -1 as a signal.
Example:
Balanced:
3
9 20
15 7
Unbalanced: adding a deeper left child.
Q85: Convert Sorted Array to Height Balanced BST
Use divide-and-conquer: the middle element becomes root, left half becomes left subtree, right half becomes right subtree. This produces minimal height, ensuring optimized search performance.
Example:
Array: [-10,-3,0,5,9]
Root = 0
Tree becomes balanced.
Q86: Convert Sorted Linked List to BST
Unlike arrays, linked lists do not provide random access. Use slow/fast pointers to find mid node (root), recursively build left and right subtrees. This ensures balanced BST creation from sorted list.
Example:
List: -10 → -3 → 0 → 5 → 9
Output: Balanced BST with root 0.
Q87: Binary Tree Right Side View
Return nodes visible from the right side. Use level-order traversal, always capturing the last node of each level. Alternatively, DFS right-first traversal with level tracking.
Example:
Tree:
1
2 3
5 4
Right view: [1,3,4]
Q88: Longest Palindromic Substring
This problem asks for the longest substring within a string that reads the same forward and backward. The typical solution uses the expand-around-center technique. Every character (and pair of characters) is treated as the center of a potential palindrome, expanding outward while the characters match. This gives O(n²) time and O(1) space. Another approach uses dynamic programming, where dp[i][j] indicates whether substring from i to j is a palindrome. Manacher’s Algorithm solves it in O(n), but expand-around-center is preferred in interviews for clarity.
Example:
Input: “babad”
Output: “bab” or “aba”
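An expand-around-center sketch in Python:

```python
def longest_palindrome(s):
    if not s:
        return ""
    best = s[0]

    def expand(lo, hi):
        while lo >= 0 and hi < len(s) and s[lo] == s[hi]:
            lo -= 1
            hi += 1
        return s[lo + 1:hi]     # the last window that still matched

    for i in range(len(s)):
        for candidate in (expand(i, i), expand(i, i + 1)):  # odd and even centers
            if len(candidate) > len(best):
                best = candidate
    return best

print(longest_palindrome("babad"))  # 'bab' (or 'aba')
```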
Q89: Regular Expression Matching
You need to determine whether a string matches a pattern containing . (matches any character) and * (matches zero or more of the preceding element). This problem requires dynamic programming. dp[i][j] means: does s[0..i] match p[0..j]?
Key rules:
- If characters match or the pattern has ., dp[i][j] = dp[i-1][j-1]
- If the pattern has *, there are two choices:
  - Use zero occurrences → dp[i][j] = dp[i][j-2]
  - Use one or more → if characters match, dp[i][j] |= dp[i-1][j]
This problem teaches regex engines, DP transitions, and greedy pitfalls.
Example:
Input: s = “aab”, p = “c*a*b”
Output: true
Q1. Design a Parking Lot System
A parking lot system manages vehicles entering, parking, and exiting across multiple floors. The design includes classes like ParkingLot, Floor, ParkingSpot, Vehicle, Ticket, and Payment. Each floor contains a set of spots categorized by type—compact, large, or handicapped. When a car enters, a Ticket is issued with timestamp and allocated spot. On exit, the system calculates charges based on duration and spot type. The system must track availability in real time and update counts efficiently. Optional modules include automated barriers, license-plate recognition, and digital payment support. Scalability includes adding more floors or spot types without major changes.
Q2. Design BookMyShow / Movie Ticket Booking System
This system manages theaters, movies, showtimes, and seat bookings. Key classes include Movie, Theater, Screen, Show, Seat, Booking, and Payment. A screen contains seats arranged in rows; a show links a movie with a screen and time. Users browse movies, select a show, view available seats, and reserve them. A locking mechanism temporarily locks seats during checkout to prevent double-booking. The system must support cancellations, seat availability refresh, multiple seat types (VIP, recliner), and regional scheduling. Scalability involves distributed seat-locking, caching movie listings, and supporting peak traffic during blockbuster releases.
Q3. Design an Elevator System
An elevator system manages multiple elevators serving requests from different floors. Core components include Elevator, Controller, FloorRequest, ElevatorState, and Direction enums. The controller decides which elevator should handle a request using algorithms like nearest-car or load balancing. Each elevator tracks its current floor, direction, queue of requests, door state, and weight capacity. When a request occurs, it gets added to an elevator’s queue and processed sequentially. Advanced features include overload sensors, maintenance mode, fire mode, and predictive dispatching for peak hours. Concurrency handling is important because multiple requests may arrive simultaneously.
Q4. Design a Library Management System
This system manages books, borrowers, inventory, and check-in/check-out processes. Key classes include Book, BookItem (physical copy), Member, Librarian, Loan, Fine, and Catalog. A member can search for a book by title, author, or category. When borrowing, availability is checked and a loan entry is created with due date. Overdue books generate fines. The catalog organizes books by metadata and enables fast search. Librarians can add/remove books, block members, and manage reservations. Extensions include digital books, notifications for due dates, and analytics dashboards for usage tracking.
Q5. Design an ATM System
ATM system includes ATM, Card, Account, Transaction, Bank, and CashDispenser classes. When a user inserts a card, the ATM authenticates through the bank server using PIN verification. Users perform operations like withdrawal, deposit, balance inquiry, and mini statements. Cash dispensing involves checking account balance, daily withdrawal limit, and available currency denominations. Each transaction is logged with timestamps for auditing. ATM must handle failure states like insufficient funds, card jam, and network failure. Scalability involves supporting multiple banks and implementing a switch server for routing requests.
Q6. Design a Chess Game
Chess design involves modeling the board, pieces, players, and moves. Core classes include Board, Square, Piece (abstract), and specific piece types King, Queen, Rook, etc. Each piece has movement rules. The game engine validates moves, checks for check/checkmate, and manages turns. Additional components include Game, Move, Player, and rule enforcement mechanisms (castling, en passant, pawn promotion). The system can support multiple game modes—local play, AI opponent, online multiplayer. Extensibility is critical, such as adding a chess engine evaluation module.
Q7. Design a Vending Machine
A vending machine handles item selection, money insertion, dispensing, and inventory management. Main components: Item, Inventory, PaymentHandler, Coin, Bill, Display, and VendingMachine controller. The workflow: user selects item → machine checks availability → user inserts money → machine validates denominations → dispense item → return change. PaymentHandler keeps track of accepted coins/bills and change availability. Machine must also handle failure cases such as insufficient change, out-of-stock items, or jammed products. Admin console allows restocking, collecting cash, and diagnostics.
Q8. Design a Hotel Booking System
This system includes Hotel, Room, Customer, Booking, Invoice, and SearchService. Rooms belong to types (deluxe, suite, standard) and have availability calendars. Customers search hotels by date, location, and room type. Booking locks the room for requested dates and generates an invoice. Admins can manage pricing, seasonal rates, promotions, and room facilities. Avoiding double bookings requires atomic reservation or distributed locking. Optional modules include cancellation policies, loyalty points, and room service integration.
Q9. Design an Online Shopping Cart
Components include Product, Cart, CartItem, User, Order, and Payment. Users add items to their cart, modify quantities, and checkout. Cart stores item details, quantity, pricing, discounts, and tax calculation. On checkout, an order is generated, and product inventory is reduced. The system must support coupon codes, multiple payment methods, session persistence, and abandoned cart reminders. Scalability includes caching product listings and handling concurrent stock updates.
Q10. Design a Restaurant Management System
This system manages tables, reservations, orders, kitchen processing, and billing. Classes include Table, Reservation, MenuItem, Order, KitchenDisplaySystem, Bill, and Staff. Customers can reserve tables or walk in. Waiters create orders linked to a table; the kitchen receives order tickets. Once food is prepared, waiters are notified. The billing system calculates taxes, service charges, and discounts. Extensions include online ordering, POS integration, and multi-branch synchronization.
Q11. Design a Ride-Sharing System
Model includes entities: Rider, Driver, Trip, Vehicle, LocationService, MatchingService, PricingService, Payment, and Dispatcher. Riders request rides with pickup/dropoff coordinates; the Dispatcher queries available drivers (via LocationService) and uses MatchingService to pick an optimal driver (nearest, ETA, rating, surge). PricingService computes fares (base + time + distance + surge). Trip lifecycle: requested → accepted → in-progress → completed; Payment handles billing and receipts. Important subsystems: driver/rider authentication, cancellation rules, driver incentives, and real-time telemetry. Scale by sharding regions, use geo-indexing (k-d tree or geohash), and ensure idempotent trip creation to avoid double matches.
Q12. Design a Social Media News Feed
Key components: User, Post, FollowGraph, FeedGenerator, Ranker, Cache, Storage, TimelineService, and Notification. Two common models: push (fan-out on write) and pull (fan-out on read). For high-scale systems, use hybrid: push for users with small follower counts, pull for heavy hitters. FeedGenerator composes posts, Ranker orders by recency, relevance, engagement, and personalization signals. Use Redis or Memcached for recent feeds, long-term storage in Cassandra/Bigtable, and offline batch jobs for ML ranking. Consider freshness vs cost trade-offs, deduplication, and eventual consistency for likes/comments.
Q13. Design a File System
Core layers: Client API, Metadata Service (NameNode-like), Block Storage (DataNodes), Chunking & Replication, and Directory Structure. Files are split into fixed-size blocks; metadata stores file→block mappings, permissions, and timestamps. On read/write, client requests block locations from metadata and communicates directly with storage nodes. Include replication for durability, heartbeat for node liveness, and re-replication on failure. Consider consistency (write-once append vs POSIX semantics), caching, authentication, and quotas. For distributed design, support name-space partitioning and use consensus (Raft) for metadata high availability.
Q14. Design a Music Streaming Service
Components: User, Catalog, Track, Playlist, StreamingService, CDN, Transcoder, RecommendationEngine, and Billing. Tracks stored in object storage; manifest + bitrate variants generated by Transcoder. StreamingService authenticates, retrieves serving URL from CDN, and handles range requests. RecommendationEngine suggests playlists using collaborative and content-based filtering. Offline downloads require DRM and encrypted segments. Scale: shard catalog, use geo-CDNs for low latency, prefetch user queues, and enforce rate limits. Business considerations: content licensing, metadata quality, caching hot tracks, and analytics for engagement.
Q15. Design a Calendar Application
Core models: Event, Calendar, User, Invitation, RecurrenceRule, Notification, and Availability. Support creating events, recurring events, time-zone handling, and attendee RSVPs. Conflict checking uses free/busy queries across participants. Recurrence uses RFC 5545 rules with exception dates. NotificationService sends reminders via email/push; allow calendar sharing with ACLs. For scale, separate read-heavy views (list of events) from write operations (event creation) and use optimistic locking for concurrent edits. Handle daylight saving changes and consistent invite state across devices.
Q16. Design a Notification System
The system has Event Producers, Ingestion Layer (queue/Kafka), Notification Service, Channel Adapters (email, SMS, push), User Preferences, Retry/Backoff, and Monitoring. Producers emit events; consumers transform them into notifications respecting user preferences and throttling rules. Use a priority queue for urgent messages and batch similar messages. For reliability, persist events and implement retry with exponential backoff and dead-letter queues. Rate limit per user/channel and provide delivery receipts. Ensure idempotency and privacy (suppress sensitive channels).
Q17. Design a Logging Framework
Components: Logger API, Appender/Handler, Formatter, Log Store, and Indexing/Query. Logger API (sync/async) writes structured logs; Appenders forward to files, remote collectors, or streaming services. Buffer logs in local agents to survive network blips and ship to centralized stores (Elasticsearch, Splunk) for search and analytics. Include log rotation, sampling, dynamic log levels, and correlation ids for tracing. Consider security (PII masking), retention policies, and alerting integration to detect anomalies or errors.
Q18. Design a Cache System
A cache layer stores hot data to reduce latency. Key components: Cache Node, Coordinator/Client Library, Eviction Policy (LRU/LFU), Replication, Consistency Protocol, and Monitoring. Choose between local in-process cache, distributed cache (Redis, Memcached), or CDN. For distributed caches, decide on sharding (consistent hashing) and replication for availability. For write-heavy workloads, consider write-through vs write-back vs cache-aside patterns and TTLs. Handle cache warm-up, cache stampede via locking or request coalescing, and stale-data strategies for eventual consistency.
Q19. Design a Task Scheduler
Task scheduler orchestrates jobs across machines. Components: Job API, Scheduler, Worker Nodes, Job Store, Queue, Executor, and Monitor. Support cron-like recurring jobs, dependency graphs, retries, priorities, and backoff strategies. Scheduler picks tasks based on resource availability and constraints, dispatches them to workers over RPC, and tracks state. For distributed scheduling, partition jobs by namespace, use leader election for scheduling ownership, and persist job state in durable storage. Provide metrics, dashboards, and alerting for failed tasks.
Q20. Design a Chat Application
Core models: User, Conversation, Message, Presence, DeliveryReceipt, and Attachment. Use a WebSocket-based real-time channel for low-latency delivery; fallback to polling for legacy clients. Messages are persisted in a store, with a message queue for fan-out and offline delivery. Presence service tracks online/offline status and last-seen. Support typing indicators, read receipts, and group chat. Ensure ordered delivery (sequence numbers) and idempotent handling for retries. For scaling, partition conversations by id, use cached recent messages for fast reads, and apply end-to-end encryption where required.
Q21. Design a URL Shortener
Core components: ShortenerService, Hashing/Encoding Module, DB (Mapping store), Redirect Service, Analytics, and Admin UI. Generate short keys using base62 encoding of an auto-increment ID, or use a collision-resistant hash (with collision resolution). Store mapping {shortKey → longURL, createdAt, expiry, clickCount} in a highly-available DB (Cassandra/Redis + RDBMS for metadata). Redirect service reads mapping and issues 301/302. Handle custom aliases, rate limits, and abuse detection. For scale, shard keys by prefix, cache hot mappings in Redis, and use CDN edge redirects to reduce latency. Add URL validation, TTL, and soft-delete for moderation.
Q22. Design a Rate Limiter
Rate limiting enforces request quotas per user/IP. Key modules: Token Bucket / Leaky Bucket algorithms, Client Library, Central Store (Redis), and Policy Engine. Token bucket allows bursts and refills tokens periodically. Use Redis INCR with TTL or Lua scripts to implement atomic check-and-decrement semantics across distributed servers. Support multiple policies (per-second, per-minute, sliding window). For multi-region deployments, apply local limits at edge and global sync for hard limits, or use client-specific tokens with central reconciliation. Provide headers (X-RateLimit-Remaining) and graceful throttling (429 responses + Retry-After). Monitor and whitelist trusted clients.
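A single-process token-bucket sketch in Python; a distributed version would keep this state in Redis and update it atomically with a Lua script, as described above:

```python
import time

class TokenBucket:
    def __init__(self, capacity, refill_rate):
        self.capacity = capacity          # max burst size
        self.refill_rate = refill_rate    # tokens added per second
        self.tokens = float(capacity)
        self.last_refill = time.monotonic()

    def allow(self):
        now = time.monotonic()
        elapsed = now - self.last_refill
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last_refill = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False   # caller should respond 429 with a Retry-After header

limiter = TokenBucket(capacity=5, refill_rate=2)  # 5-request burst, 2 req/s sustained
print(all(limiter.allow() for _ in range(5)))     # True: burst allowed
print(limiter.allow())                            # False: bucket drained
```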
Q23. Design a Pub-Sub System
Components: Producers, Brokers, Topic Partitions, Consumers, Consumer Groups, Coordinator, and Durable Storage (commit logs). Producers publish messages to topics; brokers append to partitioned logs. Consumers in a group read partitions in parallel and commit offsets for fault tolerance. Use partitioning for parallelism and ordering guarantees per partition. Provide at-least-once delivery by default, with optional exactly-once semantics via idempotent producers and transactional commits. Use a Kafka-like design for scalability: leader-follower replication, a controller for partition assignments, and retention policies. Support push (broker→consumer) or pull models; add ACLs and quotas for multi-tenancy.
Q24. Design a Thread Pool
A thread pool manages worker threads to execute tasks concurrently. Key components: Worker Threads, Task Queue, ThreadPool Manager, Rejection Policy, and Metrics. The manager maintains a core and maximum size, keeps threads alive for a keep-alive duration, and pulls tasks from a bounded queue. Rejection policies include Abort, CallerRuns, DiscardOldest, and Discard. Implement graceful shutdown with awaitTermination and draining queues. For robustness, catch task exceptions, optionally resubmit failed tasks, and instrument queue lengths and worker utilization. Tune pool sizing to match CPU-bound vs I/O-bound workloads.
Q25. Design a Connection Pool
A connection pool maintains a set of reusable connections (for example, to a database) so clients avoid the cost of opening one per request. Key components: Pool Manager, Connection Factory, Idle/Active Sets, Validation, and Metrics. Clients acquire a connection, use it, and release it back; the pool enforces a maximum size, blocks or fails fast when exhausted, and evicts idle connections after a timeout. Validate connections before handing them out (ping/health check), detect leaks via acquire timestamps, and recreate broken connections. Tune the min/max pool size to the workload and monitor wait times and saturation.
Q26. Design an In-Memory Key-Value Store
Core parts: Storage Engine, Persistence Layer (AOF / snapshot), Eviction Policy, Replication, Sharding, Client Protocol, and Monitoring. Use hash tables for O(1) access, optional data structures (sorted sets, lists), and a background snapshot thread for persistence. Implement replication (master-slave) for durability and quorum writes for consistency trade-offs. Eviction policies include LRU, LFU, and TTL-based expiry. For scale, shard data using consistent hashing and support resharding with minimal disruption. Add access control, Lua scripting for atomic operations, and metrics for hit/miss rates.
Q27. Design Splitwise
Key entities: User, Group, Expense, Transaction, and Settlement Engine. When an expense is added, split rules (equal, percent, exact) compute owed balances. Maintain per-pair ledger entries and aggregate net balances per user to minimize settlements. Settlement engine suggests minimal transactions using graph balancing (net creditors vs net debtors). Support recurring expenses, multi-currency conversion, and fairness rules. Ensure eventual consistency for concurrent edits (optimistic locking) and provide audit logs. For scalability, partition groups by id and cache frequently accessed balances client-side.
Q28. Design a Voting System
Models: Election, Candidate, Voter, Ballot, Tally Engine, and Auditing. For single-round elections, provide ballot submission APIs, enforce voter eligibility, and ensure one-vote-per-person via authentication and tokenization. Store ballots immutably and compute tallies with streaming aggregation. For ranked-choice voting, implement instant-runoff tallying algorithm. Security: end-to-end verifiability, tamper-evident logs (append-only), and cryptographic receipts if required. Prevent double-voting via strong identity checks and rate limits. Ensure high availability on election day and rigorous backups; support manual recounts and audit trails.
Q29. Design a Coupon System
The coupon system manages creation, validation, application, and redemption of promotional codes. Core classes: Coupon, Campaign, Eligibility Engine, Redemption Service, and Usage Store. Coupons may have constraints (validity window, user segments, min order value, item restrictions, max redemptions). On checkout, the Eligibility Engine evaluates rules; a transactional redemption step ensures atomic decrement of quota (use DB row-level locking or Redis Lua script). For high throughput promotions (flash sales), pre-allocate coupon pools and use optimistic concurrency or token buckets to prevent oversubscription. Provide reporting for ROI and fraud detection.
Q30. Design a Warehouse Management System
Core components: Inventory, Location (slots/racks), Inbound/Outbound Processing, Order Picking, Replenishment, and Workforce Management. Model items with SKUs and map to physical locations. Inbound flow: receive, QC, assign storage location, and update inventory. Outbound: pick-pack-ship with optimized pick routes (batching, zone picking). Track lot numbers, expiry dates, and FIFO/LIFO policies. Integrate barcode/RFID scanners and conveyor automation. Ensure strong consistency for stock levels (use reservations/locks during order processing) and eventual sync for analytic reporting. Scale across multiple warehouses with centralized catalog and distributed stock management.
Q31. Design a Job Scheduler
A Job Scheduler submits, schedules, and runs jobs (batch or ad-hoc). Core components: Job API, Job Store, Scheduler (Leader), Worker Pool, Queue / Broker, Executor, Retry / Backoff, and Monitor. Jobs register with metadata (cron, one-shot, dependencies, resource hints). The scheduler picks runnable jobs based on time and dependencies, enqueues tasks to workers via a broker (Kafka/RabbitMQ), and persists job state. Workers fetch tasks, execute, report status, and emit logs/metrics. Ensure idempotency, lease-based locking for at-most-once execution, and durable retries. Scale by partitioning jobs by namespace and running multiple schedulers with leader election (Raft/ZooKeeper).
Q32. Design an Autocomplete System
Autocomplete returns ranked suggestions as users type. Key pieces: Query API, Trie/Prefix Index, Inverted Index, Ranking Service, Cache/Edge, and Usage Logger. Build prefix indices (tries or n-gram indices) for fast lookup; store candidate metadata (popularity, recency). Ranking uses signals: frequency, personalization, geography, and context. For large corpora, precompute top-K suggestions per prefix and cache them in Redis/edge. Support fuzzy matching and typo tolerance using BK-trees or edit-distance-aware indices. Maintain asynchronous updates via streaming ingestion (Kafka) and use A/B testing to measure relevance. Provide latency SLAs (<50–100ms) for interactive UX.
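A sketch of precomputing top-K suggestions per prefix in a trie, as described above (scores and K are illustrative):

class TrieNode:
    def __init__(self):
        self.children = {}
        self.top_k = []                # precomputed (score, suggestion) pairs

def build(root, phrases, k=5):
    for phrase, score in phrases:      # phrases: iterable of (text, popularity)
        node = root
        for ch in phrase:
            node = node.children.setdefault(ch, TrieNode())
            node.top_k.append((score, phrase))
            node.top_k.sort(reverse=True)   # keep best-scored first
            node.top_k = node.top_k[:k]     # cap at top-K per prefix

def suggest(root, prefix):
    node = root
    for ch in prefix:
        if ch not in node.children:
            return []
        node = node.children[ch]
    return [p for _, p in node.top_k]  # O(len(prefix)) lookup, ranked results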
Q33. Design a Leaderboard System
Leaderboards rank users by score (global or per-category). Components: Score Ingest API, Score Store, Ranking Engine, Top-K Cache, and User View Service. For high write rates, buffer updates and use incremental aggregators to update sorted structures. Use Redis Sorted Sets for small-to-medium scale for O(log N) updates and O(k) reads. For massive scale, shard by leaderboard ID, maintain per-shard top-K and merge on read. Support rank queries (user rank, nearby ranks) via reverse-scans or point queries. Ensure eventual consistency for scalability and provide snapshot export for analytics. Handle ties deterministically (timestamp or user-id).
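A sketch of the Redis Sorted Set approach using redis-py (board and key names are illustrative):

import redis  # assumes a reachable Redis instance

r = redis.Redis()

def record_score(board, user, score):
    r.zadd(board, {user: score})               # O(log N) insert/update

def top_k(board, k=10):
    return r.zrevrange(board, 0, k - 1, withscores=True)

def user_rank(board, user):
    rank = r.zrevrank(board, user)             # 0-based, highest score first
    return None if rank is None else rank + 1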
Q34. Design a Messaging Queue
A messaging queue decouples producers and consumers. Core parts: Producers, Broker Nodes, Partitioned Queues, Consumer Clients, Offset Management, Replication, and Retention. Producers append messages to partitioned, append-only logs; brokers replicate partitions for durability. Consumers pull messages and commit offsets; consumer groups provide parallelism. Support delivery semantics (at-least-once default), message ordering per partition, and exactly-once via idempotent processing and transactional commits. Add flow-control, backpressure, and dead-letter queues for poison messages. Scale by partitioning topics and adding brokers; use controller nodes for metadata and leader election.
Q35. Design a Stock Exchange Matching Engine
A matching engine matches buy and sell orders by price-time priority. Components: OrderBook, OrderEntry API, Matching Core, Risk Checks, Trade Reporter, and Audit Trail. Orders (limit, market, IOC) are inserted into price-level queues (bids/asks). The matching loop attempts to match crossing orders, producing trades and updating order states. Use in-memory, low-latency data structures with persistence for durability and replay. Enforce checks (balances, limits, circuit breakers). Scale via partitioning by instrument and colocating engines for hot instruments; prioritize deterministic single-threaded matching per instrument to avoid concurrency issues. Emphasize latency, reliability, and regulatory auditability.
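A simplified sketch of price-time priority matching with heaps; a real engine would add order IDs, cancels, and market/IOC order types:

import heapq, itertools

seq = itertools.count()  # arrival counter -> time priority within a price level
bids, asks = [], []      # bids: (-price, seq, qty); asks: (price, seq, qty)

def submit_limit(side, price, qty):
    trades = []
    if side == "buy":
        while qty and asks and asks[0][0] <= price:          # crossing asks
            ask_price, t, ask_qty = heapq.heappop(asks)
            fill = min(qty, ask_qty)
            trades.append((ask_price, fill))
            qty -= fill
            if ask_qty > fill:                               # partial fill rests
                heapq.heappush(asks, (ask_price, t, ask_qty - fill))
        if qty:
            heapq.heappush(bids, (-price, next(seq), qty))   # rest on the book
    else:
        while qty and bids and -bids[0][0] >= price:         # crossing bids
            neg, t, bid_qty = heapq.heappop(bids)
            fill = min(qty, bid_qty)
            trades.append((-neg, fill))
            qty -= fill
            if bid_qty > fill:
                heapq.heappush(bids, (neg, t, bid_qty - fill))
        if qty:
            heapq.heappush(asks, (price, next(seq), qty))
    return trades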
Q36. Design an Auction System
An auction system supports listing items, bidding, and winner determination. Main components: Auction Catalog, Bid API, Real-Time Engine (WebSockets), Bid Store, Timer Service, and Payment Gateway. Bids must be validated (auth, min increment). For real-time auctions, publish current-highest bid to subscribers and prevent race conditions via atomic compare-and-set in a central store (Redis with Lua or a dedicated leader). Support different auction types (English, Dutch, sealed-bid). Handle last-second sniping with soft close (extend end time on late bids). Maintain audit logs and anti-fraud checks. Scale by partitioning auctions and using backpressure for high-load events.
Q37. Design a Loyalty Points System
A loyalty system tracks points earned, redeemed, and tier status. Components: User Profile, Points Ledger, Earning Rules Engine, Redemption Service, Balance Cache, and Reporting. Use append-only ledgers for immutable records and compute balances by aggregation; cache current balances for quick reads. Rules engine evaluates promotions, multipliers, and expiry policies. For redemptions, use transactional operations or optimistic locking to prevent double-spend. Provide reconciliation jobs and fraud monitoring. Support tier upgrades/downgrades based on rolling activity windows and integrate with external partners via secure APIs. Ensure GDPR compliance for user data.
Q38. Design a Content Management System (CMS)
A CMS manages authoring, publishing, and serving content. Core components: Content Model, Authoring UI, Workflow / Approval, Versioning, Storage (Blob + Metadata DB), CDN, and Search Index. Authors create content (articles, pages) with metadata and media. The workflow supports drafts, reviews, scheduling, and publishing. Versioning enables rollback. For serving, pre-render pages and cache via CDN for low latency; dynamic personalization can be layered on top. Use Role-Based Access Control (RBAC) for editors and audit logs for edits. Scale by separating authoring (write-heavy) from serving (read-heavy) and using microservices for media processing.
Q39. Design a Survey System
A survey platform supports creating forms, distributing them, and collecting responses. Components: Form Builder, Question Types, Response Store, Distribution (links, emails), Analytics, and Export. Forms can have branching logic, validation rules, and quotas. Responses are stored in schemaless stores (for dynamic question sets) or normalized schemas for analytics. Provide near-real-time dashboards and sampling/export to CSV/BI tools. Ensure data privacy (anonymization) and rate limits to prevent spam. For high response rates, use write-optimized stores (Cassandra) and batch analytics pipelines to compute aggregates efficiently.
Q40. Design a Recommendation Engine
Recommendation systems suggest items using collaborative filtering (user-item interactions), content-based models (item features), and hybrid approaches. Components: Event Collector, Feature Store, Model Training Pipeline, Online Ranker, and Serving Cache. Offline jobs compute embeddings and collaborative signals; online ranker combines retrieval (candidate generation) and re-ranking (personalized scoring). Use approximate nearest neighbor (ANN) indices for fast similarity lookups. Personalization should consider freshness, popularity, business rules, and diversity. A/B test models continuously and log exposures for debiasing. Scale by grouping users, sharding candidate indices, and providing low-latency caches for hot users.
Q41. Design a Version Control System
A version control system (VCS) manages file changes, branching, and merges. Core components: Repository, Commit Object, Blob/Tree Storage, Reference Store (branches/tags), Index/Staging Area, and Network Protocol (push/pull). Commits are immutable DAG nodes pointing to trees (snapshots) and parents. Local operations (commit, branch) are cheap; pushes/pulls synchronize repositories via packfile transfers and delta compression. Merge strategies include fast-forward, three-way merge, and conflict resolution tools. For scalability, use object storage (S3) for large repos, and implement packfile/garbage-collection to reclaim space. Security: signed commits and access control for protected branches.
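A toy sketch of content-addressed commit objects; real Git uses a specific object serialization, so the JSON payload here is purely illustrative:

import hashlib, json

def hash_object(data: bytes) -> str:
    return hashlib.sha1(data).hexdigest()      # content-addressed object ID

def make_commit(tree_hash, parent_hashes, message):
    # Immutable DAG node: the ID is derived from the content, so any change
    # to the tree, parents, or message yields a different commit.
    commit = {"tree": tree_hash, "parents": parent_hashes, "msg": message}
    payload = json.dumps(commit, sort_keys=True).encode()
    return hash_object(payload), commit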
Q42. Design a Comment System
A comment system supports threaded comments, moderation, and reactions. Components: Comment, User, Thread, Moderation Queue, Notification, and Search/Index. Comments are stored with parent pointers to enable nesting; flattening strategies (materialized paths or adjacency lists) speed up reads. Implement visibility rules (public/private), upvote/downvote, and spam detection. Moderation supports automated filters and manual review; soft deletes preserve the audit trail. Scale reads by caching top-level threads and using pagination or lazy-loading replies. Provide rate limits to prevent abuse and a rich-text sanitizer to avoid XSS vulnerabilities.
Q43. Design a Subscription Management System
Manage plans, billing cycles, trials, and renewals. Key modules: Plan Catalog, Customer Account, Subscription, Billing Engine, Payment Gateway Integration, Invoice & Notifications, and Proration Logic. Subscription lifecycle: signup → trial → active → cancel → expire. Billing engine runs recurring jobs, applies taxes and discounts, and handles failed payments with retry/exponential backoff. Support upgrades/downgrades with proration and usage-based billing. Ensure idempotent billing operations, strong reconciliation, and PCI-compliant payment handling. For scale, decouple billing events via queues and use event sourcing for auditability.
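A sketch of the proration arithmetic for a mid-cycle plan change (prices and cycle length are example inputs):

def prorate(old_price, new_price, days_left, days_in_cycle):
    """Credit unused time on the old plan, charge the same window on the new."""
    unused_credit = old_price * days_left / days_in_cycle
    new_charge = new_price * days_left / days_in_cycle
    return round(new_charge - unused_credit, 2)  # amount due now (may be negative)

# e.g. upgrading from $10 to $30 halfway through a 30-day cycle:
# prorate(10, 30, 15, 30) -> 10.0 due immediately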
Q44. Design a Deeper Job Scheduler
For complex workflows, the scheduler supports DAGs of jobs with dependency resolution. Components: Workflow Definition (DAG), Execution Engine, Dependency Tracker, Task Queues, and State Store. Jobs transition states (pending, running, success, failed). Scheduler enforces dependency order and parallelism constraints, supports retries with backoff, and allows manual reruns of failed nodes. Persist execution lineage and outputs for reproducibility. For long-running workflows, checkpoint intermediate results and support partial retries. Use leader election to assign workflow owners and partition workflows for horizontal scale.
Q45. Design an A/B Testing Platform
An A/B platform routes users to experiment buckets, collects metrics, and reports significance. Components: Experiment Manager, Bucketing Service, Event Collector, Metric Store, and Analyzer. Bucketing uses consistent hashing or user-id bucketing to ensure sticky assignments. Events are logged and aggregated in streaming pipelines; metric computation runs offline or online for real-time dashboards. Support experiment targeting, multi-armed and multi-metric experiments, and guardrails (rollback on negative impact). Ensure low-latency bucketing (<5ms) for production traffic and avoid leakage between variants. Use statistical methods (t-tests, Bayesian) and correct for multiple comparisons.
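A sketch of sticky, hash-based bucketing; salting by experiment name keeps assignments independent across experiments:

import hashlib

def bucket(user_id: str, experiment: str, variants=("control", "treatment")):
    # Same user + same experiment always hashes to the same variant (sticky),
    # and the uniform hash spreads users evenly across variants.
    h = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    return variants[int(h, 16) % len(variants)]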
Q46. Design an Analytics Event Pipeline
An event pipeline ingests, processes, and stores telemetry for analytics. Components: SDK/Collector, Ingestion (Kafka), Stream Processing (Flink/Spark Streaming), Storage (OLAP like ClickHouse/BigQuery), and Query/Visualization. Producers batch and compress events client-side. Streaming jobs perform enrichment, deduplication, sessionization, and compute rolling aggregates. Store raw events in cold storage for replay and aggregates in OLAP for fast queries. Implement a schema registry for compatibility, monitoring for data quality, and backpressure handling. Provide replay mechanisms and data lineage for auditability.
Q47. Design an Access Control System
Access control manages who can do what. Components: Identity Store, Policy Store, Policy Decision Point (PDP), Policy Enforcement Point (PEP), and Audit Logs. RBAC maps users → roles → permissions. ABAC evaluates policies based on attributes (user, resource, environment). PDP evaluates requests against policies and returns allow/deny; PEP enforces the decision in services. Use policy languages (Rego/Open Policy Agent) for flexible rules. Cache decisions for performance, but ensure TTL invalidation on policy changes. Strong auditing and least privilege defaults are essential.
Q48. Design a Feature Flag System
Feature flags allow toggling behavior without deployments. Components: Flag Store, Evaluation Service, SDKs, Rollout Rules, Targeting, and Dashboard. Flags can be boolean, multivariate, or percentage-based. Use a consistent, low-latency SDK that evaluates flags locally (with cached configs) and periodically refreshes. For gradual rollouts, use percentage bucketing and target by user segments. Ensure safety: default-off fallback, kill-switch for emergencies, and audit logs. For critical flags, tiered rollout with canary and monitoring is recommended.
Q49. Design a Data Versioning System
Data versioning tracks dataset snapshots across time (like DVC). Components: Dataset Store, Metadata Catalog, Commit Objects, Diff & Merge Tools, and Storage (object store). Track data changes via content-addressable storage, store pointers (manifests) and metadata. Support branching, tagging, and reproducibility of experiments. For large files, use deduplication and chunked uploads. Integrate with ML workflows for model training, and provide tools for comparing dataset snapshots and lineage. Ensure access controls and efficient retrieval of deltas for incremental processing.
Q50. Design a Secrets Management System
A secrets manager securely stores API keys, DB credentials, and certificates. Components: Secret Store, Access Policy Engine, Audit Logs, Secret Rotation, and Client SDKs. Secrets are encrypted at rest (KMS-managed keys) and served via authenticated APIs with fine-grained access control. Offer automatic rotation and versioning, and support short-lived credentials via dynamic backends. Protect against exfiltration by restricting network access, enforcing least-privilege, and auditing every access. Provide secret injection for CI/CD and environment variable templating for services.
HLD
Q1. Exception Metric Collection & Alerting System
Design a pipeline to ingest millions of exceptions/sec, index them for search, and alert on thresholds. Core components: Client SDKs (structured exception capture), Ingress Layer (Kafka), Stream Processor (Flink/Spark) for enrichment (service, stackframe hashing), Indexing/Store (Elasticsearch or ClickHouse for counts), and Alerting Engine. Deduplicate using fingerprinting (hash of stack trace + message) to collapse noise and compute Top-K/Bottom-K queries via aggregation windows. Alerts use rate/threshold rules and anomaly detection; route via Pager/SMS/Slack. Scalability: partition by service, shard indices by time, and use retention tiers (hot/warm/cold). Provide sampling, backpressure, and circuit-breakers to avoid overload during error storms.
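A sketch of fingerprinting for deduplication; the frame fields and the five-frame normalization depth are assumptions for illustration:

import hashlib

def fingerprint(exc_type, message, stack_frames):
    # Hash the type, message, and top frames so repeated occurrences of the
    # same crash collapse to one group for counting and alerting.
    top = "|".join(f"{f['file']}:{f['func']}" for f in stack_frames[:5])
    return hashlib.sha256(f"{exc_type}|{message}|{top}".encode()).hexdigest()[:16]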
Q2. Feature Button Click Count System
Collect click events for product analytics with low latency aggregation. Components: Client Instrumentation, Ingestion (Kafka), Real-time Aggregator for per-button counters (streaming state store), Online Store (Redis for counters), and Batch OLAP (for long-term analytics). Use idempotent event IDs to avoid duplicates and windowed aggregation for minute/hour/day metrics. For scale, partition events by feature ID or user shard and maintain local aggregators to reduce cross-network traffic. Provide APIs for dashboards and rate-limited real-time feeds. Consider consistency trade-offs: serve slightly stale counts (seconds) for performance; provide eventual consistency guarantees and backfill absent windows.
Q3. Calendar System
Design events, invites, recurrences, and notifications for millions of users. Core components: Event Service, Calendar Store (partitioned DB), Invitation Service, Availability & Free/Busy Service, Recurring Rules Engine (RFC 5545), and Notification Service. Use per-user calendars sharded by user ID. For invites, implement atomic invite state transitions and conflict checks using optimistic locking or reservations. For recurring events, store rules and compute occurrences lazily (materialize on demand) to avoid explosion. Handle time zones and DST carefully. For scale, cache busy windows for quick conflict checks, and offload heavy queries to precomputed indices. Provide replication and cross-device sync with change tokens.
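A sketch of lazy occurrence expansion using the python-dateutil rrule module, which implements RFC 5545-style recurrence rules:

from datetime import datetime
from dateutil.rrule import rrule, WEEKLY

start = datetime(2024, 1, 1, 9, 0)
rule = rrule(WEEKLY, dtstart=start, byweekday=(0, 2))  # Mondays and Wednesdays

def occurrences_between(rule, window_start, window_end):
    # Materialize only the occurrences inside the query window instead of
    # expanding the (potentially infinite) series up front.
    return rule.between(window_start, window_end, inc=True)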
Q4. Distributed Job Scheduler
Schedule and reliably execute jobs across a fleet. Components: API/Orchestrator, Metadata Store, Leader Election, Task Queue (Kafka), Worker Fleet, and Coordination Service (Zookeeper/Etcd). Partition jobs by namespace; leaders manage scheduling for assigned partitions. Use leases to assign tasks to workers and persist task state for recovery. For dependency DAGs, persist edges and track readiness counts. Support retries with backoff, durable logs for audit/replay, and sidecar health checks. Scalability: horizontally scale schedulers, use consistent hashing to assign job ownership, and autoscale workers based on queue depth. Ensure idempotent tasks and implement rate limiting to protect downstream systems.
Q5. Catalog Enhancement System at Google Scale
Enhance product catalog events (ingest, enrich, validate, augment) for analytics and search. Pipeline: Event Ingestion (Pub/Sub) → Validation Service → Enrichment/Normalization (mapping taxonomies, prices, images) → Feature Computation → Serving Stores (search index, analytics DB, materialized views). Use stream processing for near-real-time enrichment and batch jobs for expensive enrichments (image tagging, ML inference). Maintain schema registry and apply semantic versioning for events. For scale, horizontally partition by merchant/catalog ID and use backpressure with durable queues. Provide replay for reprocessing on schema fixes. Ensure idempotency and monitor freshness SLAs; maintain lineage for auditing and debugging.
Q6. LinkedIn Feed System
Serve personalized feeds at billion-user scale. Components: Producer API (writes), Storage (cold: HDFS/Blob, hot: Redis/Cassandra), Feed Generator (hybrid push/pull), Ranking Service (ML), and Serving Layer (edge caches/CDN). Use fan-out-on-write for users with moderate followers (precompute) and fan-out-on-read for celebrities to avoid huge writes. Candidate generation retrieves relevant posts (social graph traversal, recent activity, engagement), then ranker reorders using ML signals (relevance, recency, user preference). Cache top-K per user and use background re-ranking to reduce read latency. Handle consistency (eventual), spam detection, deduplication, and experiments per cohort. Scale with sharding by user ID and offline pipelines for feature computation.
Q7. LinkedIn Messaging System
Design a low-latency, highly available messaging service for 1:1 and group chats. Components: Client SDK (WebSocket/HTTP), Front-end Gateways, Message Store (hot cache + durable DB), Delivery Queue, Presence Service, and Push Notification Adapters. Use append-only logs per conversation for ordering and fan-out to active connections for real-time delivery. Persist messages for history and offline delivery. For durability, replicate partitions and use consensus for leader election. Support read receipts, typing indicators, and message edits/deletes with CRDT or operational transforms for conflict resolution. Partition conversations by convo-id and colocate hot partitions; provide end-to-end encryption if required and rate-limit to mitigate spam.
Q8. Job Recommendation Engine
Serve job recommendations personalized to candidate profiles and behaviors. Components: Event Collector, Feature Store (user & job embeddings), Candidate Retriever (ANN indices), Re-ranker (ML model), and Feedback Loop. Offline training computes embeddings using collaborative, content, and graph signals. Online retrieval uses ANN (FAISS/HNSW) for low-latency nearest neighbors and re-ranks with contextual features (freshness, location, application intent). Use feature cache for active users and log exposures/clicks to retrain models. Address fairness and diversity via constraints in re-ranking. Scale by sharding indices by geography/industry and keep cold-start strategies (popularity, rule-based) for new users/jobs.
Q9. LinkedIn Mutual Connection Search
Find mutual connections across a massive graph with low latency. Components: Graph Store (adjacency lists in a distributed DB), Indexer (precomputed 2-hop lists for heavy users), Query Engine, and Cache. Naïve approach: fetch the neighbors of both users and intersect them, which is costly when either node has high degree. Optimize by: (1) iterating the smaller adjacency list and checking membership in a hash set of the larger (see the sketch below); (2) precomputing and caching mutual counts for frequent pairs; (3) using Bloom filters to quickly rule out membership; (4) using two-hop materialized views for rapid mutual discovery. For huge graphs, bound searches by degree or return approximate top-K mutuals. Ensure privacy controls respecting connection visibility.
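A sketch of optimization (1), iterating the smaller adjacency set and probing the larger:

def mutual_connections(adj, user_a, user_b):
    a, b = adj[user_a], adj[user_b]        # adjacency sets per user
    small, large = (a, b) if len(a) <= len(b) else (b, a)
    # Iterate the smaller set and probe the larger: O(min(|a|, |b|)) expected.
    return [u for u in small if u in large]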
Q10. LinkedIn People Search
Design scalable people search with ranking, filters, and facets. Core components: Indexing Pipeline (ingest profiles, skills, work history), Search Index (Elasticsearch/Solr), Query Frontend, Ranker (ML), Aggregation Service (facets), and Authorization Layer. Precompute signals (recency, shared connections, mutual company) and store them as doc fields for fast retrieval; use candidate generation + neural re-ranker for personalization. Support complex filters (location, title, industry) and fuzzy matching for names. Provide strict access controls to hide profiles per privacy settings. Scale by sharding indices and using multi-stage ranking to keep latency low; use offline pipelines for feature refresh and AB testing for ranking models.
Q11. Design an IDE like Visual Studio
An IDE includes editor, compiler integration, project explorer, debugging, and plugins. Components: Editor Engine, Language Server Protocol (LSP) Integration, Build System, Debugger, File System Watcher, and Plugin Framework. LSP enables syntax highlighting, auto-completion, diagnostics. Debugger interacts with runtime using protocols like DAP. Scale by plugin sandboxing and incremental compilation.
Q12. Design Rate Limiter
A distributed rate limiter enforces quotas across services. Components: Gateway, Limiter Service, Distributed Counter (Redis/Etcd), and Policy Store. Implement sliding window, token bucket, or leaky bucket. Use Lua scripts or atomic operations for correctness. Cache limits locally to reduce latency. Provide global vs per-region consistency modes. For very high throughput, use hierarchical rate limiting (edge + centralized).
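A single-node token bucket sketch; a distributed limiter would keep this state in Redis behind an atomic Lua script, as noted above:

import time

class TokenBucket:
    def __init__(self, rate, capacity):
        self.rate = rate                  # tokens refilled per second
        self.capacity = capacity          # maximum burst size
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self, cost=1):
        now = time.monotonic()
        # Refill based on elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False                      # over quota: reject or queue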
Q13. Design Notification Service
Supports email/SMS/push notifications. Components: Event Ingestion, Notification Orchestrator, Channel Adapters, User Preference Store, Retry/Backoff, and DLQ. Events are routed via Kafka. The orchestrator checks user preferences and rate limits, then triggers channel adapters. Ensure idempotency and use exponential backoff for retries. Push notifications go through APNS/FCM. Scale using horizontal workers and per-channel queues.
Q14. Design Analytics System
Pipeline: Collector → Ingestion (Kafka) → Stream Processing (Flink) → Storage (ClickHouse/BigQuery) → Dashboard. Collector batches events from clients; stream processors aggregate metrics (DAU/MAU, funnels). OLAP store supports fast analytical queries. Scale via partitioning by event type, optimizing cold/hot storage, and schema evolution.
Q15. Design Search Engine
Supports crawling, indexing, and ranking. Components: Crawler, Parser, Indexer, Inverted Index, Query Engine, Ranking (PageRank + ML), and Storage. Crawl web pages, parse tokens, index them with doc metadata, compute ranking signals. Query engine performs BM25 scoring + ML relevance. Scale by sharding index, caching query results, and using distributed file systems.
Q16. Design Recommendation System
Pipeline: Event Logging, Feature Store, Candidate Retrieval (ANN), Reranking (ML models), Serving Layer. Offline training computes embeddings; online retrieval fetches nearest neighbors and scores them. Scale through ANN indices, caching, and personalized feed generation.
Q17. Design API Gateway
API Gateway handles routing, rate limiting, authentication, SSL termination, and request transformations. Components: Gateway Proxy, Policy Engine, Rate Limiter, Auth (OAuth/JWT), and Observability. Scale with L7 load balancers, horizontal replicas, and circuit-breaking to protect backend services.
Q18. Design Distributed Cache
Distributed cache stores hot data for low-latency access. Components: Cache Nodes, Sharding (consistent hashing), Replication, Eviction Policies, and Client Library. Use Redis Cluster or Memcached. Handle cache miss storms with request coalescing. Scale via node addition/removal and shard rebalancing.
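A consistent hashing sketch with virtual nodes (the hash function and vnode count are illustrative choices):

import bisect, hashlib

class HashRing:
    def __init__(self, nodes, vnodes=100):
        # Each physical node appears at many points; virtual nodes smooth the
        # key distribution and limit movement when nodes join or leave.
        self.ring = sorted(
            (self._h(f"{n}#{i}"), n) for n in nodes for i in range(vnodes)
        )
        self.keys = [h for h, _ in self.ring]

    @staticmethod
    def _h(s):
        return int(hashlib.md5(s.encode()).hexdigest(), 16)

    def node_for(self, key):
        i = bisect.bisect(self.keys, self._h(key)) % len(self.keys)
        return self.ring[i][1]            # first node clockwise from the key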
Q19. Design API Gateway
Handles request routing, authentication, rate limiting, metrics, and retries. Use Envoy/Nginx as the data plane, with a control plane managing config. Provide JWT validation, schema validation, and circuit breaking. Scale horizontally.
Q20. Design Distributed Key-Value Store
A distributed KV store offers low-latency get/put operations. Components: Coordinator, Sharding (consistent hashing), Replication (quorums), Storage Nodes, Write Path (WAL), Compaction, and Anti-Entropy (Merkle trees). Support linearizable or eventual consistency via quorum reads/writes. Scale by adding shards and rebalancing automatically.
Q21. Content Delivery Network (CDN) — Global Edge Caching
A CDN caches content close to users to reduce latency and origin load. Core components: Edge PoPs, Origin Servers, Global Control Plane, Routing/Anycast, and Cache Invalidation. Request routing uses Anycast + DNS or HTTP redirection to direct users to nearest PoP. Edges serve cached objects (static files, media); on miss, they fetch from origin and store copies with configurable TTL. Invalidation supports purge and versioned URLs. For scale, partition content by geography, use hierarchical caching (regional + edge), and auto-scale origin capacity. Trade-offs: aggressive caching improves latency but complicates freshness; design for cache stampede mitigation (request coalescing) and secure signed URLs for private content.
Q22. CDN: Video Streaming at Scale
Video CDN must serve large, sequential segments with adaptive bitrate (ABR). Pipeline: Uploader → Transcoder → Segmenter → Origin → Edge CDN. Transcoder generates multiple bitrate renditions; segmenter slices into chunks with manifests (HLS/DASH). Edge PoPs cache segments; player requests manifests then segments, switching bitrates as bandwidth changes. Support byte-range requests for seeking and range caching. For scale, pre-warm edge caches for viral content, leverage HTTP/2/3 and QUIC, and use CDN edge logic for bitrate-aware caching. Handle live streaming via shorter segments and low-latency protocols (LL-HLS, CMAF, WebRTC). Ensure consistent encryption/DRM across renditions.
Q23. DNS + Global Load Balancing
Global load balancing routes users to healthy regional endpoints. Use Anycast to route to nearest PoP and Geo-DNS for policy-based routing. Combine with health checks and intelligent traffic steering (weighted, latency-based). A control plane updates DNS or BGP advertisements during failover. For application-level balancing, use global load balancers (layer-7) that inspect headers to route by continent, compliance rules, or user affinity. Consider TTL tuning and DNS caching behavior when performing failovers. Trade-offs: Anycast gives fast routing but complicates fine-grained traffic splits; Geo-DNS offers control but is slower to converge.
Q24. Authentication & Authorization Service
Central auth service issues access tokens and enforces policies. Components: Identity Provider (IdP), Token Service, Authorization Service (PDP/PEP), User Store, and Session Management. Use OAuth2/OpenID Connect for delegated auth; issue signed JWTs for stateless verification. For fine-grained policies, use ABAC via a PDP (e.g., OPA) and PEP in service gateways. Maintain refresh token flows for long-lived sessions and revocation lists for security. Scale token issuance horizontally and cache public keys for verification. Trade-offs: JWTs enable low-latency validation but require strategy for revocation and key rotation; short-lived tokens reduce risk.
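A sketch of issuing and verifying short-lived JWTs with the PyJWT library; the secret handling is simplified for illustration (in practice, KMS-managed and rotated):

import time
import jwt  # PyJWT

SECRET = "rotate-me"

def issue_token(user_id, ttl=900):
    # Short-lived token limits exposure if leaked, per the trade-off above.
    now = int(time.time())
    claims = {"sub": user_id, "iat": now, "exp": now + ttl}
    return jwt.encode(claims, SECRET, algorithm="HS256")

def verify_token(token):
    # Stateless verification; raises if the token is expired or tampered with.
    return jwt.decode(token, SECRET, algorithms=["HS256"])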
Q25. Observability / Monitoring & Alerting Platform
Collect metrics, traces, and logs; detect incidents and surface alerts. Components: Telemetry SDKs, Ingestion (Kafka), Metrics Store (Prometheus/TSDB), Tracing (Jaeger), Log Store (Elasticsearch/ClickHouse), and Alerting Engine. Stream processing aggregates metrics into rollups; traces link requests across services; logs carry context. Alerts are based on thresholds, anomaly detection, or SLO burn rates and integrate with paging tools. Design retention tiers (hot/warm/cold) and sampling for traces. Important: high cardinality controls, backpressure handling, and secure telemetry ingestion. Balance retention cost vs operational need.
Q26. Metrics Pipeline & Time-Series Platform
A high-throughput metrics pipeline ingests counters/gauges, aggregates, and serves dashboards. Components: Edge Aggregators, Ingestion Queue, Real-time Aggregator, TSDB (write-optimized), and Query Layer. Ingest edges perform local rollups to reduce cardinality; stream processors compute minute/hour/day aggregates; TSDB stores raw and aggregated series. Use partitioning by metric name or team to scale writes. Support downsampling and retention policies. Provide fast top-k and dimension-filter queries via precomputed indices. Trade-offs: aggressive aggregation reduces storage but loses granularity for debugging.
Q27. Real-time Stream Processing Platform
Process events with low latency (e.g., counts, joins, enrichments). Components: Event Bus (Kafka), Stream Processor (Flink/Spark Streaming), State Store, and Sink Connectors. Support event-time processing with watermarks to handle late data. Stateful operators use local RocksDB-backed state with changelog replication for fault tolerance. Scale by partitioning topics and operator parallelism. Ensure exactly-once or at-least-once semantics depending on sinks; use idempotency at sink if needed. Provide metrics for operator lag and automated scaling policies.
Q28. Batch Processing System / Data Lake Jobs
Batch systems process large datasets for analytics/training. Components: Job Scheduler (YARN/K8s), Distributed FS (HDFS/S3), Execution Engine (Spark), and Metadata Catalog (Hive/Glue). Jobs read from object storage, shuffle/aggregate, and write outputs to data lake. Optimize with file formats (Parquet), partitioning, and columnar compression. For large clusters, use autoscaling, spot instances, and workload isolation (quota/namespace). Emphasize reproducibility with job manifests and versioned inputs. Trade-offs: batch is cost-efficient for heavy compute but has higher latency vs streaming.
Q29. Data Warehouse / OLAP Design
OLAP stores support complex ad-hoc analytics and BI. Use Columnar Stores (BigQuery/Redshift/ClickHouse), ETL/ELT Pipelines, and Materialized Views for common aggregations. Organize schemas in star/snowflake patterns and partition by time for pruning. Provide IAM for dataset access and row-level security. For scale, separate storage and compute (serverless or cluster-based) and use query caching. Ensure data freshness SLAs via incremental loads and CDC. Focus on query performance tuning and cost control (e.g., query quotas).
Q30. ETL / ELT Pipeline
Ingest transactional DB changes into analytics without heavy loads. Components: CDC Capture (Debezium), Event Bus, Transform (stream/batch), and Data Lake/Warehouse Loaders. CDC captures row-level changes with commit order, preserving exactly-once semantics via offsets. Transform steps normalize, enrich, and validate data. For ELT, raw change logs land in staging and are transformed in the warehouse. Manage schema evolution with a registry and provide backfill/replay capability. Key trade-offs: CDC offers near-real-time analytics but complicates schema migration and ordering guarantees.
Q31. Graph Data Platform / Graph Query Service
Serve graph queries (shortest path, mutual connections) at scale. Components: Graph Store (partitioned adjacency lists), Indexing (degrees, labels), Query Engine, and Cache. Partition by vertex-id using hash or community-aware partitioning to reduce cross-shard traversal. For short path queries, use bidirectional BFS with heuristics (bounded depth) and precomputed two-hop indexes for heavy nodes. Use caching for frequent queries and Bloom filters for quick membership checks. Trade-offs: balancing low-latency traversal vs storage overhead for precomputed indices.
Q32. Geospatial Search & Proximity Service
Support geo-queries (radius search, nearest neighbors). Components: Geo Index (geohash/R-tree), Tile-based Sharding, Routing API, and Reverse Geocoding. Index points in geohash buckets and for a radius query, probe adjacent buckets and filter by precise Haversine distance. Use specialized stores for heavy loads (Elasticsearch with geo indices or spatial DBs like PostGIS). For real-time moving objects, use chronologically partitioned updates and in-memory geo indexes for hot regions. Trade-offs: bucket granularity affects false positives/scan cost.
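A sketch of the precise distance filter applied after the geohash bucket probe:

from math import radians, sin, cos, asin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    # Great-circle distance between two points on Earth (radius ~6371 km).
    dlat, dlon = radians(lat2 - lat1), radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * 6371 * asin(sqrt(a))

def radius_search(candidates, lat, lon, radius_km):
    # `candidates` are (lat, lon) points from the query's geohash bucket and
    # its neighbors; the exact filter removes the bucket's false positives.
    return [p for p in candidates if haversine_km(lat, lon, p[0], p[1]) <= radius_km]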
Q33. Time-Series Database
Store high-cardinality metric streams with efficient writes and compression. Components: Write Frontend, Ingest Shards, Local TSDB Instances (downsample & compaction), and Query Engine. Optimize for append-only workloads with time-ordered segments and index by metric+tags. Implement retention policies and downsampling pipelines for long term. Support high cardinality with label-based partitioning and use inverted indices for tag filtering. Provide PromQL or SQL dialect for queries and fast aggregations via precomputed rollups. Trade-offs: storage cost vs retention window.
Q34. Feature Store for ML
Centralize features for training and serving. Components: Ingestion/Transformation, Offline Store (Parquet / Data Warehouse), Online Store (low-latency DB like Redis/Cassandra), Feature Registry, and Serving API. Batch jobs materialize features for training; streaming jobs update online store for low-latency inference. Feature registry tracks versions and lineage. Ensure consistency between offline and online features via timestamping and backfills. Scale by partitioning features per entity (user/item) and caching hot entities. Key trade-offs: freshness vs storage and compute cost.
Q35. Model Serving & Prediction System
Serve ML models at low latency and high throughput. Components: Model Repository, Model Server (TF-Serving, TorchServe), Feature Fetcher, Request Router, and A/B Canary System. Models are deployed as containerized services behind an inference gateway. For heavy models, use batching or GPU-backed inference; for scale, autoscale replicas and use warm-up traffic for cold starts. Support multi-model routing via model version keys, and provide logging of inputs/outputs for monitoring and retraining. Maintain latency SLAs and enforce input validation and quotas.
Q36. Experimentation / A-B Testing Platform
Platform for safe feature rollouts and impact measurement. Components: Experiment Manager, Bucketing Service, Exposure Logger, Metric Aggregator, and Analyzer. Assign users deterministically to variants using user-id hashing for stickiness, log exposures and events to stream for aggregation, and compute online metrics with stream processors. Support feature-level rollouts (percentage, targeting) and rapid rollback. Provide guardrails for negative impact detection and integrate with deployment pipelines for progressive rollouts. Ensure data quality and metric drift detection.
Q37. Payment Processing System
Handle authorizations, captures, refunds, and reconciliations. Components: Payment Gateway, PCI-compliant Vault / Tokenization, Acquirer/Processor Integrations, Settlement Engine, Fraud Checks, and Ledger. Tokenize card details, route transactions to multiple acquirers with failover, and record ledger entries for bookkeeping. Implement retry and idempotency for network errors and reconcile daily settlements. Ensure strong encryption, PCI scope minimization (use hosted fields), and regional compliance (e.g., PSD2/3D Secure). Scale by batching settlement operations and isolating fraud pipelines.
Q38. Fraud Detection System
Detect fraudulent activity using both rule-based and ML approaches. Components: Event Collector, Feature Enrichment, Real-time Scorer (low-latency), Batch Scorer (ML retraining), Alerting/Case Management, and Feedback Loop. Real-time scoring uses precomputed features for latency-sensitive checks (velocity, geo anomalies). Suspicious events trigger holds and human review. Offline models train on labeled data; online models are refreshed regularly. Maintain strict false-positive controls and provide explainability for decisions. Scale by feature caching and partitioning checks by account/region.
Q39. Video Streaming Architecture
Design both on-demand and live streaming: Uploader/Encoder, Origin Storage, CDN, Player, and Monitoring. For VOD, pre-transcode into renditions and distribute via CDN. For live, use ingest points, a transcode farm, and low-latency streaming stacks (WebRTC / LL-HLS). Collect end-to-end telemetry (startup time, rebuffering). Use CDN edges for segment caching and origin failover mechanisms. Implement DRM/secure tokening for protected content and offline playback via encrypted downloads.
Q40. Edge Compute Platform
Run user logic at the edge for low latency and locality. Components: Edge Runtime (containers/Workers), Deployment Orchestrator, Control Plane, State Sync, and Observability. Developers deploy functions/images to edge PoPs; control plane schedules and routes execution based on proximity and resource availability. Provide deterministic APIs for caching, KV store, and durable storage with eventual consistency. Address cold starts with warm pools and state synchronization with origin. Security model must isolate tenants and enforce resource quotas. Trade-offs: edge reduces latency but complicates consistent global state and increases deployment surface.