Three-Phase Commit and Paxos Commit
What This Concept Is
Both protocols address the single nasty property of 2PC: that a participant can be left blocked indefinitely if the coordinator fails after collecting votes.
Three-Phase Commit (3PC)
3PC adds a pre-commit phase between PREPARE and COMMIT:
- CAN-COMMIT? (prepare-to-prepare): coordinator asks each participant if it can commit. Each replies yes/no without yet preparing.
- PRE-COMMIT: if all voted yes, the coordinator sends PRE-COMMIT and each participant acknowledges. Before this point, any party (including participants) can safely abort on timeout because nobody has prepared yet.
- DO-COMMIT: coordinator sends DO-COMMIT; participants commit.
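Key property: the coordinator sends DO-COMMIT only after every participant has acknowledged PRE-COMMIT, so nobody commits while some participant is still behind, and a participant that has seen PRE-COMMIT knows every participant voted yes. If the coordinator dies, the participants can elect a new one and compare states: if they are all at pre-commit, they safely finish the commit. If they did not all reach pre-commit, nobody can have committed yet, so the participants agree to abort.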
3PC is non-blocking under the assumption of a synchronous network with bounded message delays and reliable timeout-based failure detection. That assumption is exactly what asynchronous distributed systems do not offer. In practice, 3PC is vulnerable to network partitions: two sides of a partition can reach inconsistent conclusions because each side times out on the other and assumes the worst.
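The recovery rule described above can be sketched as a single function. This is a minimal illustration, not a full 3PC implementation: the `State` names and the `recovery_decision` helper are made up for this note, and the rule treats any participant that fails to reply as "behind", which is only sound if timeouts reliably indicate failure.

```python
from enum import Enum


class State(Enum):
    VOTED_YES = "voted yes, waiting for PRE-COMMIT"
    PRE_COMMIT = "saw PRE-COMMIT, waiting for DO-COMMIT"
    COMMITTED = "committed"
    ABORTED = "aborted"


def recovery_decision(participants: set[str], answers: dict[str, State]) -> str:
    """Decision a newly elected coordinator takes after the old one dies.

    `participants` is everyone in the transaction; `answers` holds the state
    of each participant that replied before the timeout (hypothetical API).
    """
    states = set(answers.values())
    if State.COMMITTED in states:
        return "COMMIT"   # a decision already exists; follow it
    if State.ABORTED in states:
        return "ABORT"
    everyone_replied = set(answers) == participants
    if everyone_replied and states == {State.PRE_COMMIT}:
        return "COMMIT"   # all at pre-commit: finishing the commit is safe
    return "ABORT"        # someone is missing or behind: assume the worst
```

The last branch is where the synchronous assumption hides: "did not reply" is read as "has failed", which a partition can make false.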
Paxos Commit
Paxos Commit (Gray and Lamport, 2006) replaces the single coordinator with Paxos consensus. Instead of one coordinator whose failure blocks the protocol, each participant's vote is recorded by its own Paxos instance, and all instances share one set of acceptors. The "decision" follows from the votes: commit if every instance chooses PREPARED, abort otherwise. With 2F+1 acceptors, the decision can be made and propagated as long as a majority (F+1) are reachable; failure of any minority is survived.
Systems like Spanner, CockroachDB, and several other distributed SQL databases use a close relative of Paxos Commit rather than the protocol exactly as published: 2PC in which the coordinator's state and each participant's state are replicated by a consensus protocol (Paxos in Spanner, Raft in CockroachDB).
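A rough sketch of how a commit outcome can be read off replicated votes. It assumes one list of acceptor responses per participant and ignores ballot numbers entirely, so it illustrates the majority rule only, not Paxos itself. The names `chosen` and `outcome` are hypothetical.

```python
from collections import Counter
from typing import Optional


def chosen(acceptor_values: list[Optional[str]]) -> Optional[str]:
    """Value held by a strict majority of acceptors, or None if none yet.

    None entries stand for acceptors that are unreachable or have not voted.
    """
    counts = Counter(v for v in acceptor_values if v is not None)
    for value, n in counts.items():
        if n > len(acceptor_values) // 2:
            return value
    return None


def outcome(instances: dict[str, list[Optional[str]]]) -> str:
    """Combine the per-participant results into one transaction outcome."""
    results = [chosen(values) for values in instances.values()]
    if any(r == "NO" for r in results):
        return "ABORT"       # one refusal is enough to abort
    if all(r == "PREPARED" for r in results):
        return "COMMIT"      # every vote is durably PREPARED
    return "UNDECIDED"       # some vote has no majority yet


votes = {
    "P1": ["PREPARED", "PREPARED", "PREPARED", None, None],  # 3 of 5 reachable
    "P2": ["PREPARED"] * 5,
}
print(outcome(votes))
```

With two of five acceptors unreachable for P1, the vote still has a majority, so the outcome is known without waiting for any single node.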
Why It Matters Here
You need to know that 3PC exists, why it does not work in the asynchronous networks that actual data centers approximate, and that modern distributed databases avoid 2PC's blocking property not by adopting 3PC but by replacing the coordinator with a consensus group (Paxos or Raft). This is the bridge into Module 5.
Concrete Example
3PC happy path
C: CAN-COMMIT? -> P1, P2
P1, P2: YES
C: PRE-COMMIT -> P1, P2
P1, P2: ACK
C: DO-COMMIT -> P1, P2
P1, P2: commit; ACK
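The same trace can be generated by a small failure-free driver. This is a sketch under simplifying assumptions: no timeouts, no crashes, and an `ABORT` message name that this note does not otherwise define. It prints one line per participant rather than grouping them.

```python
def three_phase_commit(votes: dict[str, bool]) -> list[str]:
    """Return the message trace for one 3PC round with no failures."""
    names = ", ".join(votes)
    trace = [f"C: CAN-COMMIT? -> {names}"]
    trace += [f"{p}: {'YES' if yes else 'NO'}" for p, yes in votes.items()]
    if not all(votes.values()):
        trace.append(f"C: ABORT -> {names}")   # any NO ends the round here
        return trace
    trace.append(f"C: PRE-COMMIT -> {names}")
    trace += [f"{p}: ACK" for p in votes]
    trace.append(f"C: DO-COMMIT -> {names}")
    trace += [f"{p}: commit; ACK" for p in votes]
    return trace


for line in three_phase_commit({"P1": True, "P2": True}):
    print(line)
```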
3PC coordinator crashes after PRE-COMMIT
P1 and P2 both saw PRE-COMMIT. They time out on the coordinator, elect a new coordinator among themselves, confirm everyone is at PRE-COMMIT, and complete DO-COMMIT. Non-blocking.
3PC under partition (the failure mode)
The network partitions into {C, P1} and {P2}. P2 is at PRE-COMMIT, does not hear DO-COMMIT, times out, and elects itself. It cannot confirm that P1 reached PRE-COMMIT, so it assumes the worst and decides to abort, which is the wrong decision if C sends DO-COMMIT. Meanwhile {C, P1} proceeds to commit. The result is a split-brain decision: atomicity is violated. This is why 3PC is not used in practice on asynchronous networks.
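The partition scenario can be replayed in a few lines. This is a toy model: states are plain strings, the `new_coordinator_decision` helper is hypothetical, and it encodes the recovery rule from this note (commit only if every participant is confirmed at PRE-COMMIT).

```python
def new_coordinator_decision(participants: list[str], visible: dict[str, str]) -> str:
    """Commit only if every participant is visible and at PRE-COMMIT."""
    all_visible = set(visible) == set(participants)
    if all_visible and all(s == "PRE-COMMIT" for s in visible.values()):
        return "COMMIT"
    return "ABORT"


def simulate_partition() -> dict[str, str]:
    participants = ["P1", "P2"]
    # Both participants acknowledged PRE-COMMIT before the partition.
    states = {"P1": "PRE-COMMIT", "P2": "PRE-COMMIT"}

    # Side {C, P1}: C holds both ACKs and sends DO-COMMIT; only P1 receives it.
    outcomes = {"P1": "COMMIT"}

    # Side {P2}: P2 times out, elects itself, and sees only its own state.
    outcomes["P2"] = new_coordinator_decision(participants, {"P2": states["P2"]})
    return outcomes


outcomes = simulate_partition()
print(outcomes)                                       # {'P1': 'COMMIT', 'P2': 'ABORT'}
print("atomic:", len(set(outcomes.values())) == 1)    # atomic: False
```

Neither side did anything the protocol forbids; the disagreement comes entirely from P2 reading "unreachable" as "failed".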
Paxos Commit sketch
Per participant, a Paxos instance records the vote (PREPARED or NO), and every instance runs over the same set of acceptors. The decision is derived from the votes: commit if every instance chooses PREPARED, abort if any chooses NO. A participant in doubt asks the acceptors; once a majority has answered for each vote, the participant knows the outcome and proceeds. A crashed coordinator does not block anyone because no state lives only on the coordinator: any node can take over as leader and read the votes back from the acceptors. Under partitions, only a side that can reach a majority proceeds; the minority side stalls (CAP availability loss) rather than proceeding to an inconsistent decision (safety loss).
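The majority rule is simple enough to check by hand, but a two-line helper makes the partition behavior concrete. The function name `can_decide` is made up for this note; the acceptor counts are arbitrary examples.

```python
def can_decide(reachable_acceptors: int, total_acceptors: int) -> bool:
    """A side of a partition can make progress only with a strict majority."""
    return reachable_acceptors > total_acceptors // 2


# Seven acceptors split 4/3: exactly one side can proceed.
print([can_decide(n, 7) for n in (4, 3)])        # [True, False]

# Five acceptors split 2/2/1: no side has a majority, so everyone stalls.
print([can_decide(n, 5) for n in (2, 2, 1)])     # [False, False, False]
```

Because two disjoint sides cannot both hold a strict majority, at most one side ever decides, which is what rules out the split-brain outcome seen with 3PC.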
Common Confusion / Misconception
"3PC fixes 2PC's problems." Only under the synchronous assumption. On real networks (asynchronous, partitionable), 3PC is unsafe. Most textbooks glide over this; the literature is clearer.
"Paxos Commit is just Paxos." It is Paxos applied to the commit decision and the votes. The underlying transactional work on each participant still looks like 2PC's prepare/commit.
"Paxos Commit avoids blocking entirely." Under partition, it still stalls the minority side. The trade is from "blocked forever if coordinator dies" to "unable to progress without a majority." That is the honest atomic-commit cost on asynchronous networks.
"Modern databases use 3PC." They do not. They use 2PC plus consensus-replicated coordinator state, which is essentially Paxos Commit or a close cousin.
How To Use It
When reading about a distributed database's commit protocol, ask:
- Is there a single coordinator? If yes, is its state replicated, and by what mechanism?
- What does the protocol do if the coordinator fails at each phase?
- What happens under network partition: does the minority side stall, or does it take action that could violate atomicity?
- Is the protocol called "3PC"? If so, the marketing is ahead of the engineering. Ask harder questions.
Check Yourself
- What does 3PC's pre-commit phase add that 2PC lacks?
- Under what network model is 3PC safe? Why does real Ethernet fail that model?
- What does Paxos Commit replace: the coordinator, the participants, or both?
- Why does Paxos Commit still stall the minority side during a partition?
- Why do modern distributed databases rarely implement 3PC?
Mini Drill or Application
For each failure, predict whether the protocol blocks, aborts, or commits:
- 2PC, coordinator crashes after all PREPARED votes logged, before sending COMMIT.
- 3PC on a synchronous network, coordinator dies after PRE-COMMIT reached all.
- 3PC, network partitions after PRE-COMMIT reached only one of two participants.
- Paxos Commit, one acceptor out of five crashes during the decision phase.
- Paxos Commit, network partitions into 2/5 and 3/5 during decision.
Read This Only If Stuck
- DDIA: Distributed transactions in practice (part 1)
- DDIA: Distributed transactions in practice (part 2)
- Database Internals: Coordinator failures in 3PC
- Database Internals: Distributed transactions with Spanner
- Distributed Systems: Atomic commit protocols (part 2)
- Distributed Systems: Atomic commit protocols (part 3)
- Database System Concepts: Commit protocols (part 3)