Skip to main content

Linearizability and the Single-Copy Illusion

What This Concept Is

Linearizability (Herlihy and Wing, 1990) is the strongest single-object consistency model for concurrent systems. Informally: a replicated or concurrent system behaves as if there were a single copy of the data, and every operation takes effect at some single instant between its invocation and response. That instant is its linearization point.

Formal shape: given a history of invocations and responses, the system is linearizable if there exists a total order on operations such that:

  1. The total order respects real-time order: if operation A returned before operation B started (in actual wall-clock time), then A precedes B in the total order.
  2. Each operation's return value is consistent with sequentially applying operations in that total order to a single-copy data type (register, queue, etc.).

Linearizability is about a single object. It composes: a system where every object is individually linearizable is linearizable as a whole. (It is orthogonal to serializability, which is about entire transactions over multiple objects.)

Why It Matters Here

When vendors say "strongly consistent" they usually mean linearizable. The single-copy illusion is what you want from a distributed lock service, a leader election service, a uniqueness check, a counter that must be monotonic across observers, and anything involving consensus. Linearizability is the property that Module 5's consensus algorithms exist to implement on replicated state.

Linearizability is also what you give up when you pick an eventually consistent system. Understanding precisely what is lost is the rest of the cluster.

Concrete Example

Three clients, one register x initialized to 0. Time flows left to right.

C1:   |--write(x,1)--|
C2: |--read(x) -> ?--|
C3: |--write(x,2)--|

Client 1 writes x=1, finishing before C3's write(x,2) starts. C2's read starts after C3's write finishes. Linearizable behaviors:

  • Linearize as write(1), write(2), read. read returns 2. Legal.
  • Linearize as write(2), write(1), read. This would order write(2) before write(1), but real time says C1's write finished before C3's write started. So this ordering violates real-time order. Illegal.

What about concurrent reads during the middle?

C1:   |--write(x,1)--|
C2: |--read(x) -> 1 or 0 --|

C2's read overlaps with C1's write. Either return value is legal if consistent with a linearization point inside the overlap. If C2 returned 0, some other read in C2 that occurs strictly after the overlap but still reads 0 would violate linearizability: once a read returned 1, no subsequent read can return 0 (the "register" semantics).

Common Confusion / Misconception

"Linearizable means serializable." No. Serializability is about transactions over multiple objects; linearizability is about single-object ordering with real-time respect. A system can be serializable without being linearizable (stale snapshots that are serializable as of some past time) and vice versa.

"Linearizable means consistent reads within a transaction." That is a transaction-level isolation property. Linearizability is a per-object ordering property.

"If my system uses Paxos, it is linearizable." Paxos on a replicated log plus a state machine gives linearizable operations on the state machine. But the clients' view can still see stale reads if reads bypass the log (leader-lease reads that are not confirmed via consensus, for example). Always ask where reads go.

"Linearizability is free if you have enough replicas." It always costs at minimum a round-trip to a majority for every write, and typically for reads too unless leases are in use. This is what Kleppmann calls "the cost of linearizability."

How To Use It

When you need linearizability:

  1. Leader-based replication with synchronous acknowledgment from a majority (Raft, Paxos).
  2. Reads go through the leader or require lease confirmation from a majority.
  3. The leader's term is fenced so stale leaders cannot serve post-failover reads.

Classic uses: distributed locks, leader election, unique identifier allocation, "last writer" tracking.

When you do not need linearizability, do not pay for it. A read replica for a dashboard is fine with bounded staleness; a cross-region social feed is fine with eventual consistency; a recommendation list does not need real-time order.

Check Yourself

  1. Define linearizability in terms of linearization points and real-time order.
  2. Give a history with three operations that is not linearizable, and explain which constraint it violates.
  3. Why does linearizability compose (an all-objects-linearizable system is linearizable) while serializability does not?
  4. What is the cost of a linearizable read in a majority-replicated system?
  5. Distinguish linearizability from serializability with one sentence each.

Mini Drill or Application

For each history, say whether it is linearizable. Argue by producing a legal linearization or showing none exists.

  1. C1: write(x,1) returns; C2: read(x) -> 0.
  2. C1: write(x,1) returns; C2: read(x) -> 1; C3: read(x) -> 0 where all reads and the write overlap pairwise.
  3. C1: cas(x, 0, 1) returns ok; C2: cas(x, 0, 2) returns ok (both compare-and-set on the same initial value, both succeed).
  4. C1: enqueue(q, 'a'); C1: enqueue(q, 'b'); C2: dequeue(q) -> 'b'; C2: dequeue(q) -> 'a' (FIFO queue).
  5. C1: write(x,1); C2 (concurrent): read(x) -> 0; C3 (later): read(x) -> 1; C4 (later than C3): read(x) -> 0.

Read This Only If Stuck