The Binary Heap as an Implicit Tree

What This Concept Is

A binary heap is a complete binary tree (every level full except possibly the last, which fills left-to-right) stored in a flat array, where every node satisfies the heap property.

Two variants:

min-heap: every node (\leq) its children -> root is the minimum
max-heap: every node (\geq) its children -> root is the maximum

Index math for 0-based arrays: [ \text{parent}(i) = \lfloor (i - 1) / 2 \rfloor, \quad \text{left}(i) = 2i + 1, \quad \text{right}(i) = 2i + 2. ]

Core operations:

peek -- read root, (O(1)).
sift_up(i) -- compare with parent, swap if out of order, repeat; fixes one upward violation in (O(\log n)).
sift_down(i) -- compare with the best child, swap, repeat; fixes one downward violation in (O(\log n)).
push(x) -- append at position (n), sift_up(n); (O(\log n)).
pop() -- swap root with last, shrink by one, sift_down(0); (O(\log n)).
build_heap(A) -- run sift_down from index (\lfloor n/2 \rfloor - 1) down to (0); (O(n)) total.

The implicit-tree trick is what makes heaps so cheap: no pointers, no allocations, perfect cache locality. A d-ary generalization widens the fan-out to (d) children (indices (di+1, \dots, di+d)), trading a shallower tree for more comparisons per sift_down; (d=4) is a common sweet spot on modern caches.

Distinctions vs. other ordered structures. A heap is not a binary search tree: heaps only order parent↔child, BSTs order the entire in-order traversal. A heap is not a sorted array: a sorted array has (O(1)) peek and (O(\log n)) search, but (O(n)) insert. A heap hits the (O(\log n)) insert/extract balance while keeping (O(1)) peek.

Why It Matters Here

Binary heaps are the foundation for:

Heapsort -- in-place, (O(n \log n)) worst case (the guarantee that introsort falls back to).
Priority queues -- Dijkstra, Prim, A*, Huffman, event-driven simulation.
Top-K selection and median maintenance -- sub-sort cost for streaming problems.
K-way merge for external sorting and multi-stream joins.
Segment / Fenwick trees -- cousin data structures using the same "encode a tree in an array" idea.

The linear build_heap is also the crisp example of amortized thinking in this module. Naively: each node sifts (\log n) levels -> (O(n \log n)). Correctly accounting for the depth of each node gives (O(n)), a bound you can re-derive from (\sum_{h=0}^{\log n} \lceil n / 2^{h+1} \rceil \cdot h = O(n)). Once you see this sum, the same pattern will show up in union-find, link-cut trees, and later graph algorithm analyses.

Concrete Examples

Example 1 -- build a max-heap from [3, 1, 6, 5, 2, 4].

Initial array (level-order tree reading):

index:   0  1  2  3  4  5
value:   3  1  6  5  2  4

       3
      / \
     1   6
    / \  /
   5  2 4

build_heap sifts down from index (\lfloor 6/2 \rfloor - 1 = 2) upward:

sift_down(2): node 6 vs. child 4 -- already heap-ordered.
sift_down(1): node 1 vs. children 5, 2 -- swap with 5 -> [3, 5, 6, 1, 2, 4].
sift_down(0): node 3 vs. children 5, 6 -- swap with 6 -> [6, 5, 3, 1, 2, 4]; then 3 at index 2 vs. child 4 -- swap -> [6, 5, 4, 1, 2, 3].

Final max-heap: [6, 5, 4, 1, 2, 3]. Every parent (\geq) its children.

Example 2 -- push + pop trace on a min-heap. Start from [1, 3, 2, 7, 5, 4]. push(0):

Append: [1, 3, 2, 7, 5, 4, 0], insert at index 6.
sift_up(6): parent index (2), value (2 > 0), swap -> [1, 3, 0, 7, 5, 4, 2].
sift_up(2): parent index (0), value (1 > 0), swap -> [0, 3, 1, 7, 5, 4, 2].

pop():

Swap root with last: [2, 3, 1, 7, 5, 4, 0], return 0, shrink to [2, 3, 1, 7, 5, 4].
sift_down(0): children 3 and 1 -- swap with 1 -> [1, 3, 2, 7, 5, 4]. Done.

Each operation touches (O(\log n)) positions on one root-to-leaf path, giving the worst-case bound.

Common Confusion / Misconceptions

"A heap is sorted." It is not. Heap order is local (parent↔child only); siblings and cousins have no ordering. Walking the array index-by-index does not produce a sorted sequence. To sort, you must repeatedly pop -- that is exactly what heapsort does.

"build_heap is (O(n \log n))." The naive bound double-counts: most nodes are near the bottom where sift_down does almost no work. Summing by level gives (\sum_h (n/2^{h+1}) h = O(n)). Inserting (n) elements one by one really is (O(n \log n)); doing the bottom-up build_heap is (O(n)).

"Heap search is (O(\log n)) like a BST." No -- search for an arbitrary key in a heap is (\Theta(n)) because there is no global order to exploit. Use a balanced BST, hash table, or skiplist for that.

"Ties preserve insertion order." A binary heap is not stable: two items with equal priority can emerge in any order. If stability matters (e.g., FIFO among equal priorities), add a tie-break counter into the key tuple.

How To Use It

When you need fast access to the min (or max) of a changing set:

Pick min-heap or max-heap based on which extreme you pop.
Add items with push in (O(\log n)); remove the extreme with pop in (O(\log n)).
If all data is available up front, call heapify once in (O(n)) instead of (n) pushes.
For ties you care about, embed a tie-break field in the key.
If you need decrease_key, either maintain an index map (key -> heap position) or use lazy deletion (push updated entries, filter stale entries on pop).
For d-ary heaps, use (d=4) on modern CPUs when sift_down cost dominates (it often does in Dijkstra with dense graphs).

Avoid heaps when:

You need fast search for arbitrary keys (use a hash table or balanced BST).
You need ordered traversal (use a balanced BST or sorted array).

Standard libraries: Python heapq.heappush/heappop/heapify (min-heap); C++ std::priority_queue (max-heap by default) or std::make_heap/push_heap/pop_heap on raw ranges; Java PriorityQueue (min-heap by default); Rust std::collections::BinaryHeap (max-heap).

Transfer / Where This Shows Up Later

S2 M2 (this module). Concept 15 turns the heap into heapsort and the generic PQ API; concept 16 shows it powering Dijkstra, median, and top-K.
S2 M3 (graphs). Dijkstra and Prim rely on a min-heap priority queue. A* uses a heap keyed by (f(n) = g(n) + h(n)). MST and shortest-path lower bounds are stated in terms of heap-op counts.
S2 M4 (DP & greedy). Huffman coding, Kruskal-like scheduling problems, and activity selection with deadlines use heaps.
S3 (software design). The heap-backed PQ is the canonical example of separating an ADT (priority queue) from its representation (binary heap); interface segregation lessons build on this.
S4 (systems programming). Linux kernel scheduler timers (hrtimer) use a per-CPU priority queue; event-driven I/O (epoll/kqueue timeout queues) use min-heaps.
S5 (OS). Process schedulers and memory reclamation (page-aging) reach for heaps; CFS in Linux uses a red-black tree instead, but the trade-off discussion references heaps.
S6 (databases). External sort's k-way merge uses a min-heap over (k) streams; top-N queries (ORDER BY ... LIMIT k) use a bounded max-heap internally (Postgres nodeLimit); LSM-tree compaction merges SSTables with a heap of cursors.
S7 (architecture). Rate-limiters and retry schedulers use priority queues keyed by earliest-allowed-time; architectural ADRs for SLO enforcement often cite heap-backed token buckets.
S8 (system design). Distributed job schedulers (Kubernetes priority preemption, Spark's DAGScheduler) use heaps; load balancers pick least-loaded backends from a heap of (load, backend) pairs.

Check Yourself

Why does build_heap run in (O(n)) rather than (O(n \log n))?
Give the parent/left/right index formulas for a 0-indexed array heap.
Why is heap search for an arbitrary key (\Theta(n)) despite the tree structure?
Why is a binary heap not stable, and how do you make the PQ stable?
Why does a 4-ary heap often outperform a 2-ary heap on modern CPUs for Dijkstra?

Mini Drill or Application

Build a min-heap from [9, 5, 6, 2, 3, 8, 1, 4] bottom-up. Draw the array after each sift_down.
Prove (\sum_{h=0}^{\log n} \lceil n/2^{h+1} \rceil h = O(n)) to confirm the linear build_heap bound.
Implement an array-backed min-heap with push, pop, and heapify from scratch (no stdlib). Add a unit test that replays 10,000 random operations and checks against a sorted reference.
Extend your heap to support decrease_key(i, new_key) given an index; argue why it is (O(\log n)).
Benchmark a 2-ary vs. 4-ary heap on 1M random pushes followed by 1M pops; explain the cache effect.

Read This Only If Stuck

CLRS: 6.1 Heaps
CLRS: 6.2 Maintaining the Heap Property
CLRS: 6.3 Building a Heap
Skiena: 4.3 Heapsort: Fast Sorting via Data Structures
Skiena: 4.3.2 Constructing Heaps
Sedgewick: Priority Queues
Sedgewick: Priority Queues (heap operations)
Grokking Algorithms: Priority queues intuition
Princeton Sedgewick: 2.4 Priority Queues (binary heaps) -- the canonical visual treatment with animated swim/sink.
MIT 6.006 Lecture 4: Heaps and Heap Sort -- clean derivation of the (O(n)) heapify.

What This Concept Is​

Why It Matters Here​

Concrete Examples​

Common Confusion / Misconceptions​

How To Use It​

Transfer / Where This Shows Up Later​

Check Yourself​

Mini Drill or Application​

Read This Only If Stuck​