Locking-Based Concurrency: 2PL, Lock Escalation
What This Concept Is
When multiple transactions touch the same data, the engine must decide who can do what and in what order. Lock-based concurrency control hands out permits (locks) that transactions must hold while accessing data.
The two primary lock modes:
- Shared (S): multiple holders allowed, all reading. Incompatible with
X. - Exclusive (X): single holder, can write. Incompatible with
SandX.
A compatibility matrix:
held S held X
req S grant wait
req X wait wait
Two-Phase Locking (2PL). A transaction operates in two phases:
- Growing phase: it acquires locks; it may not release any.
- Shrinking phase: once it has released a lock, it may not acquire another.
Theorem: any schedule produced by 2PL is conflict-serializable. That is the whole reason 2PL is correct. Most engines use strict 2PL, where all exclusive locks are held until commit or abort; this additionally guarantees recoverability and prevents cascading aborts.
Lock granularity. Locks can be taken at many levels: row, page, table, database. Finer granularity gives more concurrency but more bookkeeping. Engines use intention locks (IS, IX, SIX) on ancestors to coordinate between table-level and row-level requests without scanning every row's lock state.
Lock escalation. When one transaction accumulates many row-level locks (say, >5000), the engine may convert them to a single table-level lock to save memory. This trades concurrency for bookkeeping cost -- exactly when you didn't want it.
Deadlock. 2PL can deadlock: T1 holds X(A) and waits for B; T2 holds X(B) and waits for A. Nobody makes progress. Engines detect deadlock via a wait-for graph and abort one of the victims, or prevent deadlock with wait-die/wound-wait timestamp schemes.
Why It Matters Here
2PL is the concurrency model most relational engines used for decades (SQL Server still does by default; many others use 2PL for writes and MVCC for reads). Even in MVCC engines, locks exist under the hood for certain operations (schema changes, constraint checks, index maintenance).
If you do not understand 2PL, you cannot read lock-monitoring views, diagnose deadlocks, or reason about why a long-running transaction blocks everything behind it.
Concrete Example -- Conflict and Wait
T1: BEGIN
T1: UPDATE accounts SET balance = balance - 100 WHERE id = 7 -- acquires X(row 7)
T2: BEGIN
T2: SELECT balance FROM accounts WHERE id = 7 -- requests S(row 7), waits
T1: UPDATE accounts SET balance = balance + 100 WHERE id = 9 -- acquires X(row 9)
T1: COMMIT -- releases X(7), X(9)
T2: (S(7) granted, read proceeds)
T2's read is delayed by T1's uncommitted write because strict 2PL holds writer locks until commit.
Concrete Example -- Deadlock
T1: BEGIN; UPDATE A ... -- X(A)
T2: BEGIN; UPDATE B ... -- X(B)
T1: UPDATE B ... -- requests X(B), waits on T2
T2: UPDATE A ... -- requests X(A), waits on T1
Wait-for graph: T1 -> T2 -> T1. Cycle. The deadlock detector picks a victim (usually the younger or the one holding fewer locks) and aborts it. The surviving transaction proceeds.
Real engines log the victim and the cycle; seeing "deadlock detected" in the log means 2PL is doing its job, not that it is broken.
Concrete Example -- Escalation Pathology
T1: DELETE FROM events WHERE ts < now() - interval '1 year' -- matches 10^6 rows
Per-row X locks at ~100 bytes each would be ~100 MB of lock metadata. The engine escalates to an X(events) table lock. Now every other transaction touching events, even innocent readers, must wait until T1 commits.
Mitigations: break the delete into chunks (DELETE ... LIMIT 10000 in a loop), adjust escalation thresholds, or use partitioning so the bulk delete operates on a different partition from active writers.
Common Confusion / Misconception
"Strict 2PL prevents all anomalies." It prevents conflict anomalies (non-serializable schedules). It cannot prevent things that are correct under serializability but still surprise users -- like phantom reads via predicate queries. Engines use predicate locks or next-key locks to handle phantoms.
"Locks are bad for performance." Locks are how the engine trades concurrency for correctness; removing locks without replacing the guarantee gives you nothing. The real question is always which operations hold which locks for how long.
How To Use It
When diagnosing blocking:
- Find the blocking transaction (
pg_stat_activity,sys.dm_tran_locks,SHOW ENGINE INNODB STATUS). - Identify lock held, mode, and duration.
- Ask whether the transaction is long because of application logic, network wait, or slow query inside.
- If it is a legitimate long writer, consider batching, chunking, or switching to an MVCC engine / isolation level.
- For repeated deadlocks, look at the order of lock acquisition across transactions -- most deadlocks are fixed by establishing a canonical lock order.
Check Yourself
- Why do the two phases of 2PL guarantee conflict serializability?
- Why does strict 2PL hold
Xlocks until commit, and what does that prevent? - Why is lock escalation sometimes worse than holding many fine-grained locks?
Mini Drill or Application
Given a schedule, determine which transactions conflict under strict 2PL and whether a deadlock forms:
T1: r(A), w(B); T2: r(A), w(C)-- any conflict?T1: w(A); T2: r(A); T1: commit; T2: commit-- what waits?T1: r(A), w(B); T2: r(B), w(A)-- deadlock possibility?T1: UPDATE ... 10^6 rowsunder default escalation -- what locks are held?T1: SELECT ... WHERE status = 'OPEN' FOR UPDATE; T2: INSERT new row with status='OPEN'-- is there a phantom risk, and what prevents it?
Read This Only If Stuck
- Database Internals: Lock-Based Concurrency Control (Part 1)
- Database Internals: Lock-Based Concurrency Control (Part 2)
- Database Internals: Read and Write Anomalies
- Database System Concepts: 18.1 Lock-Based Protocols (Part 1)
- Database System Concepts: 18.1 Lock-Based Protocols (Part 3)
- Database System Concepts: 18.3 Multiple Granularity (Part 1)
- Database System Concepts: 18.2 Deadlock Handling (Part 1)