Skip to main content

Storage Engine Foundations Lab

Retrieval Prompts

  1. State the approximate latency gap between main memory, an SSD random read, and an HDD random seek, in rough orders of magnitude.
  2. Describe the layout of a slotted page, in one paragraph, without looking anything up.
  3. Explain what the buffer pool stores, and what the pin_count and dirty flags mean.
  4. State the LRU replacement rule, and one reason it can go badly wrong.
  5. Explain why a record's RID must remain stable across in-page moves.

Compare and Distinguish

Separate these pairs clearly:

  • random vs sequential I/O on an SSD
  • heap file vs sorted file vs clustered B+-tree
  • logical page read vs physical I/O
  • slot entry vs record body
  • dirty page vs pinned page

Common Mistake Check

For each statement, identify the error:

  1. "SSDs eliminate the random-vs-sequential distinction."
  2. "A full table scan on a cold database touches N random pages, where N is the number of rows."
  3. "When a row is deleted, the slot is reused immediately for the next insert."
  4. "A buffer pool hit is free."
  5. "LRU is the best replacement policy for any workload."

Mini Application

Do all four tasks for each scenario:

  1. list the physical actions the engine performs
  2. classify each action as sequential or random I/O
  3. estimate the number of page reads (use an 8 KB page assumption)
  4. name at least one optimization that would reduce cost

Scenarios:

  1. SELECT COUNT(*) FROM logs on a 20 GB heap file, no indexes.
  2. SELECT * FROM users WHERE id = 17 on a PK-clustered 5 GB table.
  3. INSERT INTO orders VALUES (...) where orders has 3 secondary B+-tree indexes.
  4. DELETE FROM events WHERE ts < now() - interval '365 days' matching 10^7 rows.
  5. Startup scan of the buffer pool for a freshly restarted engine (what is cold, what is warm).

Buffer Pool Simulation

For a 4-frame pool and the access sequence A B C D E A B C F A B D E F A B C D, compute the number of hits and misses under:

  1. FIFO
  2. LRU
  3. Clock (single reference bit)

Then write one sentence on why each result differs.

Evidence Check

This page is complete only if you can narrate in pages and I/Os what happens during a simple query on a cold database, before you reach for any index structure.