Skip to main content

Isolation Levels: RU, RC, RR, Serializable

What This Concept Is

The ANSI SQL standard names four isolation levels, defined by which anomalies each prevents:

LevelDirty readDirty writeLost updateRead skewWrite skewPhantom
Read Uncommittedpossiblepossible*possiblepossiblepossiblepossible
Read Committedpreventedpreventedpossiblepossiblepossiblepossible
Repeatable Read (ANSI)preventedpreventedengine-dependentpreventedpossiblepossible
Serializablepreventedpreventedpreventedpreventedpreventedprevented

(Most real engines block dirty write at any level.)

But the spec is famously underspecified. The Berenson et al. paper ("A Critique of ANSI SQL Isolation Levels") showed the anomaly-based definitions are incomplete; most engines define their own levels in practice. Notable shifts:

  • Oracle and PostgreSQL's REPEATABLE READ is actually Snapshot Isolation (prevents lost update via first-committer-wins, still allows write skew).
  • PostgreSQL's SERIALIZABLE is Serializable Snapshot Isolation (SSI), which gives real serializability.
  • MySQL InnoDB's REPEATABLE READ is SI-like with gap locks, which prevents phantoms but still allows write skew in some cases.
  • MySQL InnoDB's SERIALIZABLE escalates to 2PL.
  • SQL Server's SERIALIZABLE is 2PL; its SNAPSHOT is SI.

Why It Matters Here

Choosing an isolation level is a correctness decision disguised as a knob. Most teams either pick "Read Committed because it's the default" without checking, or pick "Serializable because serializable sounds safe" without measuring. Both choices hide a question about invariants. To choose honestly you have to know what each level prevents and what your engine actually implements under the name.

Concrete Example

A workload with three transaction shapes:

  • T_transfer: transfer between accounts (multi-row write).
  • T_report: compute total balance across all accounts.
  • T_add_user: insert a row into users with a unique-email check.

Under each isolation level:

  • Read Committed: T_report can observe a total that never existed (read skew during T_transfer). T_add_user can double-insert if two run concurrently (phantom). T_transfer can lose updates if it reads-modifies-writes balances.
  • Repeatable Read (SI in Postgres): T_report sees a consistent snapshot. Lost update prevented by first-committer-wins. Phantom on unique-email check still possible unless enforced by unique index. Write skew still possible.
  • Serializable (SSI in Postgres): all the above prevented. Cost: transactions may abort with a serialization failure, and you must retry. Throughput drops under high contention.

The right level depends on which of these anomalies the application must rule out.

Common Confusion / Misconception

"Higher isolation is always safer." It is safer against anomalies, but at the cost of throughput (more aborts, more locks, more contention). "Serializable everywhere" is often a caricature.

"The standard says what each level does." The standard's text is ambiguous; read your engine's docs.

"Repeatable Read on my engine prevents phantoms." Only on some engines. PostgreSQL's REPEATABLE READ does not fully prevent them in the write-skew sense; MySQL InnoDB's does via gap locks; the ANSI standard is silent.

"SET TRANSACTION ISOLATION LEVEL ... is enough." Only if you also handle the serialization failures that stronger levels throw. Serializable transactions in Postgres can abort with 40001; your client must retry. Tests often pass because nothing retries; production fails when load shows up.

How To Use It

For each transaction type in your application:

  1. List the invariants it reads and the invariants it writes.
  2. Ask which anomaly would break any of those invariants.
  3. Pick the weakest level that prevents that anomaly on your engine.
  4. If you pick Serializable, write a retry loop around the transaction.
  5. If you pick SI/Repeatable Read and an invariant spans predicate reads, add explicit SELECT ... FOR UPDATE or materialize a conflict row to force a write-write conflict.

Document the level and the reasoning next to the transaction code.

Check Yourself

  1. Which isolation level does PostgreSQL's SERIALIZABLE implement, and what does it do that REPEATABLE READ (same engine) does not?
  2. Why is the anomaly-based SQL standard definition of isolation levels considered incomplete (Berenson et al.)?
  3. What is the main operational cost of running at Serializable under contention?
  4. Give a case where Read Committed is actually correct and Serializable is overkill.

Mini Drill or Application

For each transaction, pick the weakest isolation level (on PostgreSQL) that preserves correctness, and justify:

  1. A read-only dashboard summing balances with no other writes.
  2. A funds transfer UPDATE A SET bal = bal - 100; UPDATE B SET bal = bal + 100;.
  3. A "pick an unused username" sign-up using SELECT then INSERT.
  4. A "one-on-call doctor at all times" invariant across two concurrent toggles.
  5. An idempotent counter incremented via UPDATE c SET n = n + 1 WHERE id = $id.

Read This Only If Stuck