Skip Lists as a Randomized Balanced-BST Replacement
What This Concept Is
A skip list is a layered linked-list structure that supports ordered dictionary operations in expected O(log n) time without using any rotations or invariants that have to be restored.
- level 0 is a sorted linked list of all keys
- level
i+1is a sorted sublist of leveli, where each element is promoted independently with probabilityp(commonly1/2) - search starts at the top level, walks right while the next key is smaller than the target, drops down a level when it overshoots, and continues
No node stores rank or balance information. The only decision per insert is "how many levels does this new node occupy?", chosen by repeated coin flips.
The geometric-distribution promotion produces, in expectation, n/2 nodes on level 1, n/4 on level 2, and so on. The expected number of levels is log_{1/p} n, and the expected search cost is log_{1/p} n / (1 - p). For p = 1/2 this is about 2 log_2 n, close to the best balanced-BST height bounds.
The crucial property is that the adversary selecting the input sequence cannot force bad shape. Once the keys are fixed, the internal coin flips decide the structure; the expected cost bound holds for every input. High-probability tail bounds (depth exceeds c log n with probability at most 1/n^{c-1}) make worst-case deviations negligible on large n.
Concurrent skip lists are the practical win that rotations cannot match. Because each level is an independent linked list, lock-free implementations (Java's ConcurrentSkipListMap, Doug Lea's design) can do fine-grained CAS updates per level instead of whole-tree rebalancing. This is one reason skip lists appear in production where balanced BSTs struggle.
Why It Matters Here
Skip lists are the cleanest example of a theme that recurs throughout this module: randomization can replace complicated invariant maintenance and still give the same expected bounds. They are used in production systems:
- Redis sorted sets (
ZADD,ZRANGEBYSCORE) are a skip list paired with a hash map - LevelDB / RocksDB memtables use skip lists for in-memory writes before flushing to SSTables
- Apache Cassandra uses skip list-like structures for its in-memory tables
- Java provides
ConcurrentSkipListMapas the default lock-free sorted map
Their code is dramatically shorter than red-black or AVL and their concurrent variants avoid whole-tree locks.
Concrete Examples
Example 1 -- search trace. Five keys inserted with the shown promotion levels:
level 2: HEAD ------------- 7 ------------ NIL
level 1: HEAD --- 3 ------- 7 ----- 12 --- NIL
level 0: HEAD -1- 3 -- 5 -- 7 -- 9- 12 -15 NIL
Search for 9:
- start at HEAD, level 2; next is
7, move right to7 - next at level 2 is
NIL, drop to level 1 - next at level 1 is
12 > 9, drop to level 0 - next at level 0 is
9, match
The search visited four nodes, which is close to log(6).
Example 2 -- insert with coin flips. Starting from an empty skip list with p = 1/2, insert keys 4, 1, 7, 3 using the coin-flip sequence HH, T, HHT, T where an H promotes to the next level and a T stops:
- insert
4: flipsHH-> level 2 -> node at levels 0, 1, 2 - insert
1: flipT-> level 0 only - insert
7: flipsHHT-> level 2 -> node at levels 0, 1, 2 - insert
3: flipT-> level 0 only
level 2: HEAD --- 4 --- 7 --- NIL
level 1: HEAD --- 4 --- 7 --- NIL
level 0: HEAD -1-3-4-----7--- NIL
Each insert does one search (to find predecessors at every level) and one splice (one write per promoted level). Cost is O(log n) expected because the search is O(log n) and the splice is O(height of new node), which is O(log n) expected.
Common Confusion / Misconceptions
"Skip list is a multi-level cache or B-tree." It is not. Every key lives at level 0; the higher levels are "express lanes" that let search skip past long stretches of the bottom list. Unlike a B-tree, each level holds the same keys at different densities, not disjoint partitions.
"Expected O(log n) means guaranteed O(log n)." With probability 1/n^c, a skip list can be deeper than c * log n. The bound is on the expectation, not a deterministic ceiling. For hard real-time systems this matters; for almost everything else it does not.
"p = 1/2 is optimal." The expected search cost is minimized near p = 1/e ≈ 0.37, but the difference versus p = 1/2 is under 10%. Implementers pick p = 1/2 because a single random bit decides the coin flip, which is cheaper than a real-valued comparison.
"You need a cryptographic RNG." You do not, unless the input is adversarial and can observe your randomness (e.g., in a hash-flooding attack on a networked service). For in-process use, xorshift or a well-seeded PRNG is fine.
How To Use It
Operational rules:
- Search: walk right at the highest level until the next key exceeds the target, drop down, repeat
- Insert: search-and-record update pointers at each level, generate the new node's level by repeated coin flips, splice it in at all levels up to its chosen height
- Delete: search-and-record update pointers, then unlink the node from every level it appears on
Expected search cost: O(log n) for p = 1/2. Expected space: O(n) (geometric distribution gives ~2n total pointers).
Implementation knobs to fix upfront:
- cap the maximum level at
log_{1/p}(n_max)to avoid pathological spikes - prefer a doubly linked level-0 list to support
predecessorinO(1)after a search - for concurrent use, use per-level CAS with version counters (Java's design)
Tradeoff versus red-black:
- simpler code: no rotations, no recoloring, no symmetric cases
- tail behavior is probabilistic, not deterministic
- concurrency is easier because each level has its own linked list
Transfer / Where This Shows Up Later
- S5 (OS): lock-free ordered maps in kernel-adjacent libraries and some filesystem journals use skip-list-style stacking; the "layered list" idea also underlies write-ahead logs with level-based compaction.
- S6 (databases): RocksDB and LevelDB memtables are skip lists; MongoDB's WiredTiger in-memory indices optionally use skip lists; Redis's sorted-set implementation is the canonical example.
- S7 (architecture): "write-heavy in-memory index" ADRs frequently land on skip lists for exactly the concurrency reason given above -- it is easier to explain correctness under contention than for a lock-free red-black tree.
- S8 (scale): distributed priority queues and time-sharded leaderboards layer skip lists over sharded storage (Redis Cluster's
ZADDacross shards), achieving expected-O(log n)semantics per shard with concurrent writers.
Check Yourself
- What determines the height of a newly inserted node?
- Why is the expected search cost
O(log n)forp = 1/2? - What does the skip list trade away compared to a red-black tree?
- Why does Redis use a skip list for sorted sets instead of a balanced BST?
- Give the high-probability bound for skip-list depth and explain where the exponent comes from.
- Why are concurrent skip lists easier to make lock-free than concurrent red-black trees?
Mini Drill or Application
- Build a skip list for the keys
2, 4, 6, 8, 10, 12, 14, 16, using explicit coin flips (or a provided sequence such asHHTTTHTH) to assign levels. Draw the result. - Trace
search(10)on your skip list, listing each node visited. - Insert
13using coin flipsHTand redraw. - Compute the total number of pointers in your structure and compare to
2n. - Implement a skip-list-based ordered map in Python or Rust (~80 lines). Benchmark
insertandlookupon 1M random integers and compare toSortedDict/BTreeMap.
Read This Only If Stuck
- Skiena: 3.4.3 Balanced search trees
- Competitive Programming: 2.3 Non-linear DS with built-in libraries (Part 1)
- Competitive Programming: 2.3 Non-linear DS with built-in libraries (Part 2)
- CLRS: What is a binary search tree (contrast)
- Sedgewick: Random numbers chapter (PRNG quality matters for skip lists)
- Pugh, W. (1990). Skip Lists: A Probabilistic Alternative to Balanced Trees, CACM 33(6)
- MIT 6.046J / 6.851 lectures on randomized data structures