Checkpoints - Get SDE Ready


Checkpoints

In database systems, checkpoints are a critical part of the crash recovery mechanism. They are used to minimize the amount of work needed during recovery after a system crash. A checkpoint is like taking a snapshot of the current state of the database at a specific moment in time.


What is a Checkpoint?

A checkpoint is a point of synchronization between the log file and the database. In its simplest form, taking a checkpoint does three things:

All log records up to that point are flushed to the disk first, so that no page reaches the disk before the log record describing its change.
All dirty pages (pages modified in memory but not yet written out) are written to the disk.
A special checkpoint record is added to the log to indicate that a checkpoint has occurred.

This helps in reducing the recovery time significantly by limiting how far back the system must scan the log during recovery.
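
The three actions listed above can be sketched with a toy in-memory model. This is a minimal illustration, not how any particular database implements checkpoints: `MiniDB`, its fields, and the tuple-shaped log records are hypothetical names chosen for the example, and "disk" is just a second Python container. The sketch flushes the log before the pages, which is the usual write-ahead logging order.

```python
from dataclasses import dataclass, field


@dataclass
class MiniDB:
    """Toy model: a log and a set of pages, each with an in-memory and an on-disk part."""
    log_buffer: list = field(default_factory=list)   # log records not yet on disk
    log_on_disk: list = field(default_factory=list)  # durable log
    buffer_pool: dict = field(default_factory=dict)  # page_id -> value in memory
    dirty: set = field(default_factory=set)          # page_ids modified in memory
    disk_pages: dict = field(default_factory=dict)   # durable pages

    def write(self, txn, page_id, value):
        # Record the change in the log before touching the page.
        self.log_buffer.append(("UPDATE", txn, page_id, value))
        self.buffer_pool[page_id] = value
        self.dirty.add(page_id)

    def flush_log(self):
        self.log_on_disk.extend(self.log_buffer)
        self.log_buffer.clear()

    def checkpoint(self):
        # 1. Flush log records so no page reaches disk ahead of its log record.
        self.flush_log()
        # 2. Write every dirty page to disk.
        for page_id in sorted(self.dirty):
            self.disk_pages[page_id] = self.buffer_pool[page_id]
        self.dirty.clear()
        # 3. Append the checkpoint record and make it durable.
        self.log_buffer.append(("CHECKPOINT",))
        self.flush_log()


db = MiniDB()
db.write("T1", "A", 10)
db.write("T1", "B", 20)
db.checkpoint()
print(db.disk_pages)       # {'A': 10, 'B': 20}
print(db.log_on_disk[-1])  # ('CHECKPOINT',)
```

After `checkpoint()` returns, memory and disk agree, and the last durable log record marks the point where they did.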


Why are Checkpoints Needed?

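Without checkpoints, in the event of a crash, the recovery system would need to go through all log entries from the very beginning to identify what needs to be redone or undone. Because the log grows with every update, recovery time would depend on how long the system has been running rather than on how much recent work was in flight.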

Checkpoints serve two purposes:

Efficiency: Reduces the number of log records to process during recovery.
Consistency: Ensures the database and log are in a synchronized state.
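
To make the efficiency point concrete, here is a small sketch that counts how many log records recovery has to examine with and without a checkpoint. The function name `records_to_scan` and the string-based log format are made up for this example. It also assumes the simplest case, where no transaction was active when the checkpoint was taken; otherwise recovery may have to look at some earlier records too.

```python
def records_to_scan(log, use_checkpoint):
    """Return the part of the log that recovery has to examine."""
    if not use_checkpoint:
        return log
    # Find the most recent checkpoint record, scanning from the end.
    for i in range(len(log) - 1, -1, -1):
        if log[i] == "CHECKPOINT":
            return log[i + 1:]
    return log  # no checkpoint was ever taken


# 1000 updates, with a checkpoint taken just before the last 10.
log = [f"UPDATE {n}" for n in range(1000)]
log.insert(990, "CHECKPOINT")

print(len(records_to_scan(log, use_checkpoint=False)))  # 1001
print(len(records_to_scan(log, use_checkpoint=True)))   # 10
```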


Types of Checkpoints

There are two main types of checkpointing strategies:

1. Sharp Checkpoint

All updates are paused temporarily.
All logs and dirty pages are flushed to disk at once.
Simple to implement, but not suitable for high-concurrency systems due to the brief pause (blocking).


2. Fuzzy Checkpoint

The system continues processing transactions while the checkpoint is happening.
A list of active transactions and dirty pages is recorded.
More complex but allows higher system availability and performance.


Checkpoint Procedure (Fuzzy Checkpoint Example)

Here’s a basic outline of how a fuzzy checkpoint works:

1. The system records the start of the checkpoint in the log.
2. It creates a list of the active transactions and a list of the dirty pages.
3. These lists are written to the log.
4. The system flushes all log records and modified data pages to disk asynchronously, while transactions continue to run.
5. Finally, the end-of-checkpoint record is written to the log.

This tells the recovery system: “Everything logged before this checkpoint started has been safely written to disk. You only need to worry about what happened after that.”
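
The outline above can be sketched as a single function. This is a simplified, single-threaded illustration under stated assumptions: the record names `BEGIN_CHECKPOINT`, `CHECKPOINT_DATA`, and `END_CHECKPOINT`, the dictionary-based log records, and the `flush_page` callback are all hypothetical. A real system would hand the flushing to a background writer while transactions keep running; here it is a plain loop.

```python
def fuzzy_checkpoint(log, active_txns, dirty_pages, flush_page):
    """Append a begin/end checkpoint pair to `log`, following the outline above."""
    log.append({"type": "BEGIN_CHECKPOINT"})

    # Snapshot both lists, because the live structures may keep changing
    # while the checkpoint is in progress.
    snapshot = {
        "type": "CHECKPOINT_DATA",
        "active_txns": sorted(active_txns),
        "dirty_pages": sorted(dirty_pages),
    }
    log.append(snapshot)

    # Flush only the pages that were dirty when the snapshot was taken.
    for page_id in snapshot["dirty_pages"]:
        flush_page(page_id)

    log.append({"type": "END_CHECKPOINT"})


log = []
flushed = []
fuzzy_checkpoint(log, {"T7", "T3"}, {"P9", "P2"}, flushed.append)

print([record["type"] for record in log])
# ['BEGIN_CHECKPOINT', 'CHECKPOINT_DATA', 'END_CHECKPOINT']
print(log[1]["active_txns"])  # ['T3', 'T7']
print(flushed)                # ['P2', 'P9']
```

Taking a snapshot of the two lists is what makes the checkpoint "fuzzy": pages dirtied after the snapshot are simply left for the next checkpoint.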


Checkpoint & Recovery

When a crash occurs:

The recovery process looks for the last completed checkpoint in the log.
It skips all the transactions that committed before the checkpoint, because their changes are already on disk.
Only transactions that were active at the time of the checkpoint, or that started after it, are considered for undo/redo.

This dramatically reduces the time and complexity of crash recovery.
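
The rules above can be sketched as a function that decides which transactions to redo and which to undo. This follows the simple textbook scheme described in this section, not the algorithm of any specific database. The tuple-based log format, the `CHECKPOINT` record carrying its list of active transactions, and the name `classify_for_recovery` are assumptions made for the example. It only classifies transactions; actually undoing a transaction that was active at the checkpoint would still require reading its older log records.

```python
def classify_for_recovery(log):
    """Return (redo, undo) sets of transaction ids, based on the last checkpoint."""
    # Locate the last checkpoint; fall back to the start of the log if there is none.
    start, candidates = 0, set()
    for i in range(len(log) - 1, -1, -1):
        if log[i][0] == "CHECKPOINT":
            start, candidates = i + 1, set(log[i][1])
            break

    # Scan only the records written after the checkpoint.
    committed = set()
    for record in log[start:]:
        kind, txn = record[0], record[1]
        if kind == "BEGIN":
            candidates.add(txn)
        elif kind == "COMMIT":
            committed.add(txn)

    redo = candidates & committed  # committed after the checkpoint: reapply
    undo = candidates - committed  # never committed: roll back
    return redo, undo


log = [
    ("BEGIN", "T1"), ("UPDATE", "T1", "A", 1), ("COMMIT", "T1"),
    ("BEGIN", "T2"), ("UPDATE", "T2", "B", 2),
    ("CHECKPOINT", ["T2"]),  # T2 was still active at the checkpoint
    ("COMMIT", "T2"),
    ("BEGIN", "T3"), ("UPDATE", "T3", "C", 3),
]
redo, undo = classify_for_recovery(log)
print(redo, undo)  # {'T2'} {'T3'}
```

T1 committed before the checkpoint, so it never appears in either set. T2 was active at the checkpoint and committed afterwards, so it is redone. T3 started after the checkpoint and never committed, so it is undone.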


Real-World Analogy

Think of a checkpoint like saving your progress in a game. If your computer crashes, you don’t want to start the entire game over — you just want to resume from your last save point. Similarly, the database uses checkpoints to save its progress so that it can recover efficiently if something goes wrong.
