The Page Layout Design Challenge

Data page design faces two core tensions:

  1. Variable-length fields: strings, JSON, arrays — lengths are unpredictable
  2. In-place update vs append-only: balancing write and read performance

Slotted Pages

Slotted Pages is the most widely used page layout in relational databases — both PostgreSQL and MySQL InnoDB adopt this design.

┌──────────────────────────────────────┐
│              Page Header              │
│  ┌──────────┬──────────────┐         │
│  │ LSN (8B) │ Free Space Ptr│ ...     │
│  └──────────┴──────────────┘         │
├──────────────────────────────────────┤
│                                      │
│          Free Space (Gap)            │
│           ↓ ↓ ↓ ↓ ↓ ↓ ↓             │
│                                      │
├──────────────────────────────────────┤
│  Tuple 2 (id=2, name="Bob")         │
├──────────────────────────────────────┤
│  Tuple 1 (id=1, name="Alice")       │
├──────────────────────────────────────┤
│              Item Pointers            │
│  ┌─────────────────────────────────┐ │
│  │ Slot 1: offset=XXX, length=YYY  │ │
│  │ Slot 2: offset=XXX, length=YYY  │ │
│  └─────────────────────────────────┘ │
└──────────────────────────────────────┘

Key Features

  1. Tuples grow backward from page end; slot array grows forward from header
  2. Slot array records each tuple’s offset and length — supports variable-length fields
  3. Deletion just marks the slot — no data movement (vacuum reclaims space later)
  4. Heap-Only Tuple (HOT) updates: create new versions in-page, avoiding index updates

PostgreSQL Implementation

// src/include/storage/bufpage.h

typedef struct PageHeaderData {
    PageXLogRecPtr pd_lsn;      // WAL LSN
    uint16         pd_checksum; // checksum
    uint16         pd_flags;    // flags
    LocationIndex  pd_lower;    // start of free space
    LocationIndex  pd_upper;    // end of free space
    LocationIndex  pd_special;  // special space offset
    uint16         pd_pagesize_version;
    ShortTransactionId pd_prune_xid;
    ItemIdData     pd_linp[FLEXIBLE_ARRAY_MEMBER]; // row pointer array
} PageHeaderData;

Alternative Layouts

Log-Structured Pages

Never overwrite; append new versions:

Page v1: [Tuple A, Tuple B]
Page v2: [Tuple A, Tuple B', Tuple C]  ← append B', C
Page v3: [Tuple A, Tuple B', Tuple C']
  • Pros: extremely fast writes (sequential append), simple crash recovery
  • Cons: needs compaction to merge versions, read amplification
  • Used by: LSM-Tree (RocksDB), WiredTiger

PAX (Partition Attributes Across)

Columnar grouping within a page:

┌─────┬──────────┬──────────┬──────────┐
│     │ Column A │ Column B │ Column C │
│Row 1│  val_a1  │  val_b1  │  val_c1  │
│Row 2│  val_a2  │  val_b2  │  val_c2  │
│ ... │   ...    │   ...    │   ...    │
└─────┴──────────┴──────────┴──────────┘
  • Pros: efficient column-wise computation, high compression
  • Cons: row-level operations require multiple memory jumps
  • Used by: DuckDB, MonetDB/X100

Summary

Scheme Best For Example Systems
Slotted Pages OLTP (balanced R/W) PostgreSQL, MySQL
Log-structured Write-heavy RocksDB, Cassandra
PAX / Columnar Analytical queries DuckDB, Vertica

References