B+-Tree Structure and Invariants
What This Concept Is
A B+-tree is a balanced search tree designed for page-oriented storage. It has two node kinds:
- Internal nodes hold
kseparator keys andk+1child pointers. They only route searches; they hold no row data. - Leaf nodes hold the actual indexed entries, sorted by key. In most engines, leaf nodes are chained in a doubly linked list so a range scan can sweep left or right after finding the start.
For order m (maximum children per internal node), the invariants are:
- every path from root to leaf has the same length (balanced)
- every node (except possibly the root) has at least
ceil(m/2)children - every leaf holds between
ceil((m-1)/2)andm-1entries - keys within a node are sorted; a key
k_iin an internal node satisfieschild_ikeys< k_i <= child_{i+1}keys - all leaves sit at the same depth
The "plus" in B+-tree matters: values live only in leaves. That lets internal nodes be dense with keys, which increases fan-out, which makes the tree shorter.
Why It Matters Here
Almost every relational engine's primary index is a B+-tree (PostgreSQL, MySQL InnoDB, SQLite, SQL Server). The tree's invariants are what guarantee:
O(log n)worst-case lookup, insert, and delete- ordered traversal in
O(n)through the leaf chain - logarithmic depth even after arbitrary insert/delete sequences
Every cost estimate you make later assumes these invariants are preserved; if an operation would violate them, the engine splits or merges nodes to restore them.
Concrete Example -- Insert with Split
Start with an order-4 B+-tree (max 3 keys per leaf). Insert keys 10, 20, 30.
[10 | 20 | 30] (single leaf, also root)
Now insert 40. The leaf overflows. Split it at the median: left leaf gets 10, 20; right leaf gets 30, 40. Copy the separator 30 up to a new internal node:
[30]
/ \
[10 | 20] [30 | 40] <-- leaves linked left->right
Insert 50, 60. The right leaf overflows again at 60:
[30 | 50]
/ | \
[10|20] [30|40] [50|60]
Insert 5, 15, 25: both the leftmost and middle leaves will split. Eventually the internal node itself can overflow and split, promoting a separator up, possibly creating a new root. A new root is the only way the tree grows in height.
Concrete Example -- Delete with Merge
Continuing from the three-leaf tree above, delete 40:
[30 | 50]
/ | \
[10|20] [30] [50|60] <-- [30] is underflowed (needs >=2 keys)
The engine tries redistribution first: borrow from a sibling if the sibling has a spare. Here [10|20] has only 2 keys (minimum), so it cannot spare one. The engine then merges [30] with a neighbor. Merge [10|20] and [30] into [10|20|30], remove the separator 30 from the parent:
[50]
/ \
[10|20|30] [50|60]
The parent now holds only [50]. If this were not the root, it too might underflow and trigger a merge upward. At the root, a parent with only one child collapses and the child becomes the new root -- this is the only way the tree shrinks in height.
Common Confusion / Misconception
"B-tree and B+-tree are the same." They are not. In a classic B-tree, values live in internal nodes too. In a B+-tree, internal nodes only route. The B+ variant wins on range scans (walk the linked leaves) and on fan-out (internal nodes have no payload competing for space).
"A full node immediately splits." Not quite. Some implementations pre-split on the way down to avoid a second traversal; others split on the way up after inserting. The external behavior is equivalent, the cost profile differs.
How To Use It
For any insert or delete into a B+-tree, ask:
- Which leaf does this touch? (follow separators from root)
- Will the operation cause overflow (insert) or underflow (delete)?
- If so, which sibling can redistribute?
- If no sibling can help, which nodes merge or split, and does the separator promotion reach the root?
Thinking in invariants rather than in code keeps you out of corner cases.
Check Yourself
- Why are all leaves at the same depth, and how is this invariant maintained across splits?
- Why does a B+-tree have two minimum-fill rules (one for internals, one for leaves) rather than one?
- Why does the tree grow only at the root and shrink only at the root?
Mini Drill or Application
Using an order-4 B+-tree (internal fan-out 4, max 3 leaf entries):
- Insert
5, 10, 15, 20, 25, 30, 35, 40, 45, 50in order. Draw the tree after each split. - Delete
20, 25in order. Draw the tree after each step, marking redistribute vs merge. - State the final height and the number of leaf nodes.