Write Ahead Log (WAL) is a log-based mechanism used to maintain durability in databases. Each WAL record represents an operation in the database, whether it's INSERT, UPDATE, DELETE, or other operations.
Structure of a WAL Record
A WAL record is a binary record with the following structure:
1. LSN (Log Sequence Number)
- A unique ID assigned to WAL records
- Stored in the WAL header (8 bytes)
2. Main Parts:
- part[0] - Determines if the record is relevant
- part[1] - Specifies the operation type performed
- part[2+] - Contains the actual metadata/payload

PostgreSQL and WAL
PostgreSQL uses WAL for durability and consistency:
- Write-First Principle - All changes are written to WAL before modifying data files
- Crash Recovery - WAL logs can be replayed to restore consistent state after crashes
- Background Writing - Dirty pages (modified in-memory pages) are periodically written to disk
- Decoding Plugins - WAL records need plugins for human-readable format:
- pgoutput - Built-in, faster, binary format
- wal2json - Extension, slower, JSON format

Replication Slots
Think of these as a cursor in a log file, which tells how far we have come in WAL records. Or you can say it’s kind of a bookmark in a book.
Why these bookmark is needed?
It prevents the system in discarding the WAL records which are not yet processed among all the consumers.
Types of this bookmark?
1. Physical slot — tracks raw binary WAL records
2. Logical slot — tracks logical decoding position (specific point in the WAL records determined by LSN)
Way it works —
1. Create logical slot using sql statement, and give it a name my_slot in this case.
SELECT * FROM pg_create_logical_replication_slot('my_slot', 'pgoutput');
2. PostgreSQL starts retaining WAL records from this point.
3. A consumer connects and fetches these records (will be explained below, what I meant by this).
4. WAL records are not discarded until the slot is moved further by the consumer.
Consumer — It’s can be anything, a NodeJS/Go or any app. Which can use this information to perform necessary action like.
1. database replication — consuming and applying WAL records events
2. notify admins in case schema update is made in the database.
and much more…
Type of fields -
Depends upon the plugin but they are the following generally.
1. LSN — log sequence number
2. Transaction Info
3. Operation type — INSERT / UPDATE / DELETE, etc.
4. Schema and Tables info — Those which were involved in the record.
5. Column Affected
6. Row data — Before and after data/snapshot of the row.
7. Additional Metadata
LSN Tracking
Consumers manage their progress through WAL records by:
- Maintaining Position - Tracking the current LSN they're processing
- Progress Monitoring - Knowing how many records have been consumed
- Acknowledgment - Confirming consumption to advance the replication slot, allowing cleanup of processed records
Fetching WAL Changes
- Polling mode — Where a consumer asks replication slot about the changes after a frequent interval again and again.
SELECT * FROM pg_logical_slot_get_changes('my_slot', NULL, NULL);
2. Streaming mode — In this changes are pushed continuously to the consumer as soon as they occurs. It can be done using
replication=database parameter in the connection string.
START_REPLICATION SLOT my_slot LOGICAL <LSN>;
Change Data Capture (CDC)
WAL records enable powerful CDC capabilities:
- Real-time Analytics - Capture data changes as they happen
- Auditing - Track all modifications for compliance
- Replication - Sync data across multiple systems
- Event Sourcing - Build event-driven architectures
Instead of querying tables directly, CDC reads WAL records to detect changes efficiently.
Next Steps
In the next blog post, we'll build a practical CDC implementation using Go and Redis.