Skip to main content

Two-Phase Locking (2PL) and Serialized Schedules

What This Concept Is

Two-phase locking is the classic mechanism for achieving serializability. The rule:

  1. A transaction acquires locks as it needs them.
  2. Once a transaction releases any lock, it cannot acquire any new lock.

The lifecycle has two phases: a growing phase (acquiring only) and a shrinking phase (releasing only). This discipline guarantees that any schedule of transactions following 2PL is conflict-serializable: equivalent to some serial execution.

Locks come in modes:

  • Shared (S): for reads. Multiple transactions can hold S on the same item.
  • Exclusive (X): for writes. Only one transaction at a time.
  • Intent locks (IS, IX, SIX) for hierarchical locking (row vs page vs table).

Strict 2PL additionally requires all write locks be held until commit/abort (releases happen atomically at transaction end). This ensures recoverable schedules and prevents cascading aborts.

Predicate locks (or next-key/gap locks) are the variant that handles phantoms: you lock the condition (e.g., "all rows with on_call = true"), not just the rows that currently match. Engines approximate this with index range locks (MySQL InnoDB gap locks).

Why It Matters Here

2PL is the oldest answer to "how do you make transactions look serial" and the ground truth for reasoning about concurrency control. Many databases (MSSQL SERIALIZABLE, MySQL InnoDB SERIALIZABLE, older DB2) implement it directly. Even when the engine uses MVCC for reads and 2PL only for writes, understanding 2PL tells you how deadlocks happen and why hot rows serialize a workload.

Concrete Example

Two transactions, strict 2PL, over rows x and y (both initially 0).

time  T1                                T2
1 S-lock(x); read x -> 0
2 X-lock(y); write y = 1
3 X-lock(y) -- BLOCKS; T2 has it
4 S-lock(x) -- BLOCKS; T1 has it on S

Deadlock at time 4: T1 waits for y, T2 waits for x. The engine detects the cycle (waits-for graph) and aborts one. In strict 2PL this is the usual cost: deadlocks show up, and the engine's detector kills the younger or the smaller.

A successful serial-equivalent run under strict 2PL:

time  T1                                T2
1 X-lock(x); write x = 1
2 X-lock(y); write y = 1
3 COMMIT (releases all X-locks atomically)
4 X-lock(x); write x = 2
5 X-lock(y); write y = 2
6 COMMIT

Schedule is equivalent to T1 then T2. Serializable.

Common Confusion / Misconception

"2PL = all locks acquired up front." That is conservative 2PL, one variant. Standard 2PL allows incremental acquisition; the rule is that once you release, you cannot acquire. Acquiring up front avoids deadlocks but sacrifices parallelism.

"2PL prevents deadlocks." It does not. It guarantees serializability, but deadlocks happen and must be detected (waits-for graph) or timed out.

"Strict 2PL means you can never read what you just wrote." Strict 2PL holds write locks until commit; you can still read what you wrote (you hold the lock). What you cannot do is release write locks mid-transaction.

"2PL is ancient and irrelevant." It is the correctness ground truth. Every isolation level is judged against serializable schedules, which 2PL was invented to produce.

How To Use It

When a 2PL engine is slow or deadlocking:

  1. Identify hot rows: rows locked by most transactions. They serialize everything.
  2. Reduce lock scope: smaller transactions hold locks for less time.
  3. Consistent lock order: acquire in the same order across transactions to avoid cycles.
  4. Use coarser or finer granularity: row-locks reduce false contention; coarse table locks reduce deadlock risk at cost of throughput.
  5. If read-heavy, consider MVCC for reads (Snapshot Isolation) and reserve 2PL for writes.

Check Yourself

  1. Why does the "no new lock after any release" rule guarantee serializability?
  2. Distinguish 2PL from strict 2PL. What property does strict 2PL add?
  3. What is a predicate lock, and why is it needed for phantom-proof serializability?
  4. What are the two main mechanisms engines use to handle deadlocks under 2PL?
  5. A workload is 99% reads and 1% writes. What does 2PL do to read throughput, and which mechanism would you replace it with?

Mini Drill or Application

For each schedule, say whether it is legal under strict 2PL:

  1. T1: X(x), W(x), REL(x); T1: X(y), W(y), COMMIT
  2. T1: S(x), R(x); T2: X(x) (waits); T1: COMMIT; T2: W(x), COMMIT
  3. T1: X(x), W(x); T2: X(y), W(y); T1: X(y) (blocks); T2: X(x) (blocks) -- deadlock
  4. T1: S(x), R(x), REL(x), X(y), W(y), COMMIT
  5. T1: X(x), W(x), COMMIT; T2: X(x), W(x), COMMIT

Read This Only If Stuck