Isolation Levels: RU, RC, RR, Serializable
What This Concept Is
The ANSI SQL standard names four isolation levels, defined by which anomalies each prevents:
| Level | Dirty read | Dirty write | Lost update | Read skew | Write skew | Phantom |
|---|---|---|---|---|---|---|
| Read Uncommitted | possible | possible* | possible | possible | possible | possible |
| Read Committed | prevented | prevented | possible | possible | possible | possible |
| Repeatable Read (ANSI) | prevented | prevented | engine-dependent | prevented | possible | possible |
| Serializable | prevented | prevented | prevented | prevented | prevented | prevented |
(Most real engines block dirty write at any level.)
But the spec is famously underspecified. The Berenson et al. paper ("A Critique of ANSI SQL Isolation Levels") showed the anomaly-based definitions are incomplete; most engines define their own levels in practice. Notable shifts:
- Oracle and PostgreSQL's
REPEATABLE READis actually Snapshot Isolation (prevents lost update via first-committer-wins, still allows write skew). - PostgreSQL's
SERIALIZABLEis Serializable Snapshot Isolation (SSI), which gives real serializability. - MySQL InnoDB's
REPEATABLE READis SI-like with gap locks, which prevents phantoms but still allows write skew in some cases. - MySQL InnoDB's
SERIALIZABLEescalates to 2PL. - SQL Server's
SERIALIZABLEis 2PL; itsSNAPSHOTis SI.
Why It Matters Here
Choosing an isolation level is a correctness decision disguised as a knob. Most teams either pick "Read Committed because it's the default" without checking, or pick "Serializable because serializable sounds safe" without measuring. Both choices hide a question about invariants. To choose honestly you have to know what each level prevents and what your engine actually implements under the name.
Concrete Example
A workload with three transaction shapes:
T_transfer: transfer between accounts (multi-row write).T_report: compute total balance across all accounts.T_add_user: insert a row intouserswith a unique-email check.
Under each isolation level:
- Read Committed:
T_reportcan observe a total that never existed (read skew duringT_transfer).T_add_usercan double-insert if two run concurrently (phantom).T_transfercan lose updates if it reads-modifies-writes balances. - Repeatable Read (SI in Postgres):
T_reportsees a consistent snapshot. Lost update prevented by first-committer-wins. Phantom on unique-email check still possible unless enforced by unique index. Write skew still possible. - Serializable (SSI in Postgres): all the above prevented. Cost: transactions may abort with a serialization failure, and you must retry. Throughput drops under high contention.
The right level depends on which of these anomalies the application must rule out.
Common Confusion / Misconception
"Higher isolation is always safer." It is safer against anomalies, but at the cost of throughput (more aborts, more locks, more contention). "Serializable everywhere" is often a caricature.
"The standard says what each level does." The standard's text is ambiguous; read your engine's docs.
"Repeatable Read on my engine prevents phantoms." Only on some engines. PostgreSQL's REPEATABLE READ does not fully prevent them in the write-skew sense; MySQL InnoDB's does via gap locks; the ANSI standard is silent.
"SET TRANSACTION ISOLATION LEVEL ... is enough." Only if you also handle the serialization failures that stronger levels throw. Serializable transactions in Postgres can abort with 40001; your client must retry. Tests often pass because nothing retries; production fails when load shows up.
How To Use It
For each transaction type in your application:
- List the invariants it reads and the invariants it writes.
- Ask which anomaly would break any of those invariants.
- Pick the weakest level that prevents that anomaly on your engine.
- If you pick Serializable, write a retry loop around the transaction.
- If you pick SI/Repeatable Read and an invariant spans predicate reads, add explicit
SELECT ... FOR UPDATEor materialize a conflict row to force a write-write conflict.
Document the level and the reasoning next to the transaction code.
Check Yourself
- Which isolation level does PostgreSQL's
SERIALIZABLEimplement, and what does it do thatREPEATABLE READ(same engine) does not? - Why is the anomaly-based SQL standard definition of isolation levels considered incomplete (Berenson et al.)?
- What is the main operational cost of running at Serializable under contention?
- Give a case where Read Committed is actually correct and Serializable is overkill.
Mini Drill or Application
For each transaction, pick the weakest isolation level (on PostgreSQL) that preserves correctness, and justify:
- A read-only dashboard summing balances with no other writes.
- A funds transfer
UPDATE A SET bal = bal - 100; UPDATE B SET bal = bal + 100;. - A "pick an unused username" sign-up using
SELECTthenINSERT. - A "one-on-call doctor at all times" invariant across two concurrent toggles.
- An idempotent counter incremented via
UPDATE c SET n = n + 1 WHERE id = $id.
Read This Only If Stuck
- DDIA: Weak isolation levels
- DDIA: Read committed
- DDIA: Snapshot Isolation and Repeatable Read (part 1)
- Database System Concepts: Transaction isolation levels
- Database System Concepts: Implementation of isolation levels
- External: Berenson et al., "A Critique of ANSI SQL Isolation Levels" (Microsoft Research)
- External: Jepsen: PostgreSQL isolation