av
WorkBlog

WAL - Write Ahead Log

Understanding Write Ahead Log (WAL) mechanism used to maintain durability in databases. Learn how WAL records work, PostgreSQL implementation, and replication slots.

Write Ahead Log (WAL) is a log-based mechanism used to maintain durability in databases. Each WAL record represents an operation in the database, whether it's INSERT, UPDATE, DELETE, or other operations.

Structure of a WAL Record

A WAL record is a binary record with the following structure:

1. LSN (Log Sequence Number)

  • A unique ID assigned to WAL records
  • Stored in the WAL header (8 bytes)

2. Main Parts:

  • part[0] - Determines if the record is relevant
  • part[1] - Specifies the operation type performed
  • part[2+] - Contains the actual metadata/payload

WAL Record Structure

PostgreSQL and WAL

PostgreSQL uses WAL for durability and consistency:

  1. Write-First Principle - All changes are written to WAL before modifying data files
  2. Crash Recovery - WAL logs can be replayed to restore consistent state after crashes
  3. Background Writing - Dirty pages (modified in-memory pages) are periodically written to disk
  4. Decoding Plugins - WAL records need plugins for human-readable format:
    • pgoutput - Built-in, faster, binary format
    • wal2json - Extension, slower, JSON format

Decoder Plugins Comparison

Replication Slots

Think of these as a cursor in a log file, which tells how far we have come in WAL records. Or you can say it’s kind of a bookmark in a book.

Why these bookmark is needed?

It prevents the system in discarding the WAL records which are not yet processed among all the consumers.

Types of this bookmark?

1. Physical slot — tracks raw binary WAL records
2. Logical slot — tracks logical decoding position (specific point in the WAL records determined by LSN)

Way it works —
1. Create logical slot using sql statement, and give it a name my_slot in this case.

SELECT * FROM pg_create_logical_replication_slot('my_slot', 'pgoutput');

2. PostgreSQL starts retaining WAL records from this point.
3. A consumer connects and fetches these records (will be explained below, what I meant by this).
4. WAL records are not discarded until the slot is moved further by the consumer.

Consumer — It’s can be anything, a NodeJS/Go or any app. Which can use this information to perform necessary action like.
1. database replication — consuming and applying WAL records events
2. notify admins in case schema update is made in the database.
and much more…


Type of fields -

Depends upon the plugin but they are the following generally.
1. LSN — log sequence number
2. Transaction Info
3. Operation type — INSERT / UPDATE / DELETE, etc.
4. Schema and Tables info — Those which were involved in the record.
5. Column Affected
6. Row data — Before and after data/snapshot of the row.
7. Additional Metadata

LSN Tracking

Consumers manage their progress through WAL records by:

  1. Maintaining Position - Tracking the current LSN they're processing
  2. Progress Monitoring - Knowing how many records have been consumed
  3. Acknowledgment - Confirming consumption to advance the replication slot, allowing cleanup of processed records

Fetching WAL Changes

  1. Polling mode — Where a consumer asks replication slot about the changes after a frequent interval again and again.
SELECT * FROM pg_logical_slot_get_changes('my_slot', NULL, NULL);

2. Streaming mode — In this changes are pushed continuously to the consumer as soon as they occurs. It can be done using
replication=database parameter in the connection string.

START_REPLICATION SLOT my_slot LOGICAL <LSN>;

Change Data Capture (CDC)

WAL records enable powerful CDC capabilities:

  • Real-time Analytics - Capture data changes as they happen
  • Auditing - Track all modifications for compliance
  • Replication - Sync data across multiple systems
  • Event Sourcing - Build event-driven architectures

Instead of querying tables directly, CDC reads WAL records to detect changes efficiently.

Next Steps

In the next blog post, we'll build a practical CDC implementation using Go and Redis.