Buffer Pool Management

Why Buffer Pools?

Disk I/O is the primary performance bottleneck in databases. A random disk read costs ~0.1–1ms, while memory access is ~100ns. The buffer pool caches hot pages in memory, minimizing disk access.

Buffer Pool Architecture

┌────────────────────────────────────────────┐
│              Buffer Pool                    │
│                                            │
│  ┌──────┐ ┌──────┐ ┌──────┐     ┌──────┐  │
│  │Frame 0│ │Frame 1│ │Frame 2│ ... │Frame N│ │
│  │ Page  │ │ Page  │ │ Free  │     │ Page  │ │
│  │  #102 │ │  #55  │ │       │     │  #204 │ │
│  │usage=3│ │usage=1│ │       │     │usage=2│ │
│  └──────┘ └──────┘ └──────┘     └──────┘  │
│                                            │
│  ┌──────────────────────────────┐          │
│  │      Clock Algorithm          │          │
│  │   Hand → Frame 2             │          │
│  └──────────────────────────────┘          │
└────────────────────────────────────────────┘

Page Lifecycle

[Disk] ──read──→ [Buffer Pool Free Frame]
                     │
                     ├── Pin (refcount +1)
                     │
                     ▼
               [Active Page]
                     │
                     ├── Unpin (refcount -1)
                     │
                     ▼
               [Evictable Page]
                     │
              ┌──────┴──────┐
              │              │
         (Evicted)     (Re-pinned)
              │              │
              ▼              ▼
          [Disk]      [Active Page]

Replacement Algorithms

Clock

See PostgreSQL Clock-sweep Algorithm Walkthrough for a deep dive.

Core idea: treat the buffer pool as a circular buffer with a clock hand. On each eviction attempt:

Advance hand
If current page usage_count > 0: decrement and continue
If usage_count == 0: select as victim, flush dirty page, replace

LRU-K

An enhanced LRU variant that tracks timestamps of the last K accesses, predicting the next access time. Better than pure LRU at preventing sequential scan pollution.

Concurrency Control

Buffer pool concurrency must balance performance and correctness:

Mechanism	Granularity	Purpose
Latch	Per-frame	Protects page content read/write
Partitioned lock	Frame groups	Reduces global contention
Lock-free (CAS)	Per-frame	High-performance scenarios

PostgreSQL uses a hybrid of per-frame LWLock + atomic state bits.

Prefetching

For predictable access patterns like sequential scans and index scans, proactively load likely-needed pages:

Sequential scan prefetch:
Scan Page 0 → background load Pages 1–3
Scan Page 1 → cache hit ✓
Scan Page 2 → cache hit ✓

Summary

Buffer pools are the backbone of database performance
Clock algorithm is a practical concurrent replacement for LRU
Multi-level locking (partitioned + per-frame) is a critical optimization
Prefetching significantly boosts sequential scan performance

References

PostgreSQL Source: bufmgr.c
MySQL InnoDB: Buffer Pool
Alex Petrov, Database Internals, Chapter 2