Skip to main content

Locking-Based Concurrency: 2PL, Lock Escalation

What This Concept Is

When multiple transactions touch the same data, the engine must decide who can do what and in what order. Lock-based concurrency control hands out permits (locks) that transactions must hold while accessing data.

The two primary lock modes:

  • Shared (S): multiple holders allowed, all reading. Incompatible with X.
  • Exclusive (X): single holder, can write. Incompatible with S and X.

A compatibility matrix:

         held S   held X
req S grant wait
req X wait wait

Two-Phase Locking (2PL). A transaction operates in two phases:

  1. Growing phase: it acquires locks; it may not release any.
  2. Shrinking phase: once it has released a lock, it may not acquire another.

Theorem: any schedule produced by 2PL is conflict-serializable. That is the whole reason 2PL is correct. Most engines use strict 2PL, where all exclusive locks are held until commit or abort; this additionally guarantees recoverability and prevents cascading aborts.

Lock granularity. Locks can be taken at many levels: row, page, table, database. Finer granularity gives more concurrency but more bookkeeping. Engines use intention locks (IS, IX, SIX) on ancestors to coordinate between table-level and row-level requests without scanning every row's lock state.

Lock escalation. When one transaction accumulates many row-level locks (say, >5000), the engine may convert them to a single table-level lock to save memory. This trades concurrency for bookkeeping cost -- exactly when you didn't want it.

Deadlock. 2PL can deadlock: T1 holds X(A) and waits for B; T2 holds X(B) and waits for A. Nobody makes progress. Engines detect deadlock via a wait-for graph and abort one of the victims, or prevent deadlock with wait-die/wound-wait timestamp schemes.

Why It Matters Here

2PL is the concurrency model most relational engines used for decades (SQL Server still does by default; many others use 2PL for writes and MVCC for reads). Even in MVCC engines, locks exist under the hood for certain operations (schema changes, constraint checks, index maintenance).

If you do not understand 2PL, you cannot read lock-monitoring views, diagnose deadlocks, or reason about why a long-running transaction blocks everything behind it.

Concrete Example -- Conflict and Wait

T1: BEGIN
T1: UPDATE accounts SET balance = balance - 100 WHERE id = 7 -- acquires X(row 7)
T2: BEGIN
T2: SELECT balance FROM accounts WHERE id = 7 -- requests S(row 7), waits
T1: UPDATE accounts SET balance = balance + 100 WHERE id = 9 -- acquires X(row 9)
T1: COMMIT -- releases X(7), X(9)
T2: (S(7) granted, read proceeds)

T2's read is delayed by T1's uncommitted write because strict 2PL holds writer locks until commit.

Concrete Example -- Deadlock

T1: BEGIN; UPDATE A ...                  -- X(A)
T2: BEGIN; UPDATE B ... -- X(B)
T1: UPDATE B ... -- requests X(B), waits on T2
T2: UPDATE A ... -- requests X(A), waits on T1

Wait-for graph: T1 -> T2 -> T1. Cycle. The deadlock detector picks a victim (usually the younger or the one holding fewer locks) and aborts it. The surviving transaction proceeds.

Real engines log the victim and the cycle; seeing "deadlock detected" in the log means 2PL is doing its job, not that it is broken.

Concrete Example -- Escalation Pathology

T1: DELETE FROM events WHERE ts < now() - interval '1 year'  -- matches 10^6 rows

Per-row X locks at ~100 bytes each would be ~100 MB of lock metadata. The engine escalates to an X(events) table lock. Now every other transaction touching events, even innocent readers, must wait until T1 commits.

Mitigations: break the delete into chunks (DELETE ... LIMIT 10000 in a loop), adjust escalation thresholds, or use partitioning so the bulk delete operates on a different partition from active writers.

Common Confusion / Misconception

"Strict 2PL prevents all anomalies." It prevents conflict anomalies (non-serializable schedules). It cannot prevent things that are correct under serializability but still surprise users -- like phantom reads via predicate queries. Engines use predicate locks or next-key locks to handle phantoms.

"Locks are bad for performance." Locks are how the engine trades concurrency for correctness; removing locks without replacing the guarantee gives you nothing. The real question is always which operations hold which locks for how long.

How To Use It

When diagnosing blocking:

  1. Find the blocking transaction (pg_stat_activity, sys.dm_tran_locks, SHOW ENGINE INNODB STATUS).
  2. Identify lock held, mode, and duration.
  3. Ask whether the transaction is long because of application logic, network wait, or slow query inside.
  4. If it is a legitimate long writer, consider batching, chunking, or switching to an MVCC engine / isolation level.
  5. For repeated deadlocks, look at the order of lock acquisition across transactions -- most deadlocks are fixed by establishing a canonical lock order.

Check Yourself

  1. Why do the two phases of 2PL guarantee conflict serializability?
  2. Why does strict 2PL hold X locks until commit, and what does that prevent?
  3. Why is lock escalation sometimes worse than holding many fine-grained locks?

Mini Drill or Application

Given a schedule, determine which transactions conflict under strict 2PL and whether a deadlock forms:

  1. T1: r(A), w(B); T2: r(A), w(C) -- any conflict?
  2. T1: w(A); T2: r(A); T1: commit; T2: commit -- what waits?
  3. T1: r(A), w(B); T2: r(B), w(A) -- deadlock possibility?
  4. T1: UPDATE ... 10^6 rows under default escalation -- what locks are held?
  5. T1: SELECT ... WHERE status = 'OPEN' FOR UPDATE; T2: INSERT new row with status='OPEN' -- is there a phantom risk, and what prevents it?

Read This Only If Stuck