Lamport Clocks and Happens-Before
What This Concept Is
In 1978 Leslie Lamport defined a partial order on events in a distributed system called happens-before (written ->). The rules:
- If events
aandbhappen on the same process andacomes beforeb, thena -> b. - If
ais the sending of a message andbis the receipt of that message, thena -> b. - If
a -> bandb -> c, thena -> c(transitive).
Two events are concurrent (written a ∥ b) if neither a -> b nor b -> a. This is the honest definition of concurrency in a distributed system.
A Lamport clock (scalar logical clock) assigns each event an integer timestamp L(e) such that a -> b implies L(a) < L(b). Each process maintains a counter. The update rule:
- On any local event:
counter := counter + 1. Tag the event with the new value. - On sending a message: tag the message with
counter. - On receiving a message with timestamp
m:counter := max(counter, m) + 1. Tag the receive event.
The clock produces a consistent ordering: if a -> b, then L(a) < L(b). The converse is not true: L(a) < L(b) does not imply a -> b. They might be concurrent.
Why It Matters Here
Lamport clocks are the minimum-metadata way to get a total order consistent with causality. Many systems use them directly or indirectly: Cassandra's write timestamps, Kafka offsets (roughly), ordered multicast protocols, and many "sequence number" ordering schemes you will see in Cluster 4.
You will also encounter their limitation: Lamport clocks cannot detect that two events are concurrent. They can only give you an order that respects causality; they cannot distinguish concurrent events from causally-ordered ones. That is what vector clocks (next concept) fix.
Concrete Example
Three processes. Pi sends events; messages drawn as arrows. Initial counters are 0.
Event Lamport timestamps: a=1, b=1, c=2, d=3, e=4, f=5.
a -> c(message) andL(a)=1 < L(c)=2. Good.a ∥ b(concurrent) andL(a)=1 = L(b)=1. Tie-break by process id if you need a total order.L(b)=1 < L(e)=4, butb ∥ e(never communicated). Lamport timestamps cannot tell you they are concurrent - they only look linearly ordered.
Common Confusion / Misconception
"If L(a) < L(b), then a happened before b." No. Lamport timestamps give a consistent total order (respecting ->), not a characterizing one. Two concurrent events will have timestamps in some order; you cannot conclude causality from scalar counters.
A second misconception: "The Lamport counter is wall-clock-ish." It is not a time. It is a counter of events seen. A process that talks to no one will have a slowly-growing counter; a chatty process will have a fast-growing one. There is no physical meaning.
How To Use It
To get a total order in a system without shared wall-clock time:
- Maintain a Lamport counter on each node.
- Include it in every message you send.
- On receipt, update as
max(local, received) + 1. - If you need a tie-breaker for equal Lamport timestamps, use process id.
The resulting (L, pid) pair gives a total order consistent with causality. That is enough for many uses: sequence numbers for database writes, ordering of log entries, and the sequence-number approach to total-order broadcast.
Check Yourself
- Why does
a -> bimplyL(a) < L(b)? - Give an example where
L(a) < L(b)butaandbare concurrent. - What property does the
+1on every event preserve? - Why is a
(L, pid)pair useful as a tie-breaker?
Mini Drill or Application
Take the diagram above but add two more messages: P3 -> P1 (after f) and P1 -> P2 (after the arrival from P3). Compute the Lamport timestamps for all new events by hand. Identify any new pairs of concurrent events.