Why Buffer Pools?
Disk I/O is the primary performance bottleneck in databases. A random disk read costs ~0.1–1ms, while memory access is ~100ns. The buffer pool caches hot pages in memory, minimizing disk access.
Buffer Pool Architecture
┌────────────────────────────────────────────┐
│ Buffer Pool │
│ │
│ ┌──────┐ ┌──────┐ ┌──────┐ ┌──────┐ │
│ │Frame 0│ │Frame 1│ │Frame 2│ ... │Frame N│ │
│ │ Page │ │ Page │ │ Free │ │ Page │ │
│ │ #102 │ │ #55 │ │ │ │ #204 │ │
│ │usage=3│ │usage=1│ │ │ │usage=2│ │
│ └──────┘ └──────┘ └──────┘ └──────┘ │
│ │
│ ┌──────────────────────────────┐ │
│ │ Clock Algorithm │ │
│ │ Hand → Frame 2 │ │
│ └──────────────────────────────┘ │
└────────────────────────────────────────────┘
Page Lifecycle
[Disk] ──read──→ [Buffer Pool Free Frame]
│
├── Pin (refcount +1)
│
▼
[Active Page]
│
├── Unpin (refcount -1)
│
▼
[Evictable Page]
│
┌──────┴──────┐
│ │
(Evicted) (Re-pinned)
│ │
▼ ▼
[Disk] [Active Page]
Replacement Algorithms
Clock
See PostgreSQL Clock-sweep Algorithm Walkthrough for a deep dive.
Core idea: treat the buffer pool as a circular buffer with a clock hand. On each eviction attempt:
- Advance hand
- If current page
usage_count > 0: decrement and continue - If
usage_count == 0: select as victim, flush dirty page, replace
LRU-K
An enhanced LRU variant that tracks timestamps of the last K accesses, predicting the next access time. Better than pure LRU at preventing sequential scan pollution.
Concurrency Control
Buffer pool concurrency must balance performance and correctness:
| Mechanism | Granularity | Purpose |
|---|---|---|
| Latch | Per-frame | Protects page content read/write |
| Partitioned lock | Frame groups | Reduces global contention |
| Lock-free (CAS) | Per-frame | High-performance scenarios |
PostgreSQL uses a hybrid of per-frame LWLock + atomic state bits.
Prefetching
For predictable access patterns like sequential scans and index scans, proactively load likely-needed pages:
Sequential scan prefetch:
Scan Page 0 → background load Pages 1–3
Scan Page 1 → cache hit ✓
Scan Page 2 → cache hit ✓
Summary
- Buffer pools are the backbone of database performance
- Clock algorithm is a practical concurrent replacement for LRU
- Multi-level locking (partitioned + per-frame) is a critical optimization
- Prefetching significantly boosts sequential scan performance
References
- PostgreSQL Source: bufmgr.c
- MySQL InnoDB: Buffer Pool
- Alex Petrov, Database Internals, Chapter 2