Skip to main content

Isolation and Anomalies Lab

Reproduce the six core concurrency anomalies on a real PostgreSQL instance, then eliminate each with the right setting. The goal is muscle memory, not a research write-up.

Setup

You need: PostgreSQL 13+, two psql sessions (call them A and B), a throwaway database.

-- as a setup step, in any session
CREATE TABLE accounts (id TEXT PRIMARY KEY, balance INT);
INSERT INTO accounts VALUES ('a1', 500), ('a2', 500);

CREATE TABLE doctors (name TEXT PRIMARY KEY, on_call BOOLEAN);
INSERT INTO doctors VALUES ('alice', TRUE), ('bob', TRUE);

CREATE TABLE counter (id INT PRIMARY KEY, n INT);
INSERT INTO counter VALUES (1, 0);

Between anomaly reproductions, reset the data with DELETE + INSERT as needed.

Retrieval Prompts

  1. State from memory the definition of each of dirty read, dirty write, lost update, read skew, write skew, phantom.
  2. Name the weakest isolation level on PostgreSQL that prevents each.
  3. State what PostgreSQL's REPEATABLE READ actually is (hint: not ANSI Repeatable Read).
  4. State what PostgreSQL's SERIALIZABLE actually is.
  5. State when a SERIALIZABLE transaction will raise SQLSTATE 40001 and what the client must do.

Lab 1: Lost Update (Read Committed)

In session A:

BEGIN;
SELECT n FROM counter WHERE id = 1; -- returns 0

Now in session B:

BEGIN;
SELECT n FROM counter WHERE id = 1; -- returns 0
UPDATE counter SET n = 1 WHERE id = 1; -- based on the value we read
COMMIT;

Back in session A:

UPDATE counter SET n = 1 WHERE id = 1;  -- same computed value; should have been 2
COMMIT;

Final SELECT n FROM counter WHERE id = 1: observe 1 instead of 2. Lost update reproduced.

Fix 1: Raise A and B to REPEATABLE READ (BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;). Second committer gets could not serialize access due to concurrent update. Retry on the client.

Fix 2: Use atomic UPDATE counter SET n = n + 1 WHERE id = 1;. No read; no lost update possible at any level.

Fix 3: SELECT n FROM counter WHERE id = 1 FOR UPDATE; at the read step in Read Committed.

Try all three and observe the diffs.

Lab 2: Read Skew (Read Committed)

In session A:

BEGIN;
SELECT balance FROM accounts WHERE id = 'a1'; -- returns 500

In session B (commits atomically):

BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE id = 'a1';
UPDATE accounts SET balance = balance + 100 WHERE id = 'a2';
COMMIT;

Back in A:

SELECT balance FROM accounts WHERE id = 'a2';  -- returns 600; sum = 1100 which never existed
COMMIT;

Fix: Raise A to REPEATABLE READ (SI in Postgres). Session A sees the snapshot from BEGIN; balance for a2 still returns 500. Total is 1000, consistent.

Lab 3: Write Skew (under REPEATABLE READ / SI)

Set both on_call = TRUE for alice and bob. Invariant: at least one on-call.

In session A, at REPEATABLE READ:

BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
SELECT COUNT(*) FROM doctors WHERE on_call; -- returns 2

Then in session B, same isolation:

BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
SELECT COUNT(*) FROM doctors WHERE on_call; -- returns 2
UPDATE doctors SET on_call = FALSE WHERE name = 'bob';
COMMIT;

In A:

UPDATE doctors SET on_call = FALSE WHERE name = 'alice';
COMMIT;

Observe both commits succeed. SELECT COUNT(*) FROM doctors WHERE on_call now returns 0. Invariant violated. Write skew reproduced under SI.

Fix: Raise both to SERIALIZABLE. One transaction will abort with could not serialize access due to read/write dependencies. Client retries; second attempt reads on_call = 1 and declines to update.

Lab 4: Phantom (via RR / SI, unique-check pattern)

Run the "reserve the meeting room" experiment from concept 5 against a table:

CREATE TABLE bookings (id SERIAL PRIMARY KEY, room INT, t_start INT, t_end INT);

Both sessions at REPEATABLE READ:

-- A:
BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
SELECT COUNT(*) FROM bookings WHERE room = 7 AND t_start < 4 AND t_end > 3; -- returns 0

-- B:
BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
SELECT COUNT(*) FROM bookings WHERE room = 7 AND t_start < 4 AND t_end > 3; -- returns 0
INSERT INTO bookings (room, t_start, t_end) VALUES (7, 3, 4);
COMMIT;

-- A:
INSERT INTO bookings (room, t_start, t_end) VALUES (7, 3, 4);
COMMIT;

Two overlapping bookings now exist. Phantom reproduced.

Fix 1: Raise both to SERIALIZABLE. One aborts.

Fix 2: Use an EXCLUSION constraint (ALTER TABLE bookings ADD EXCLUDE USING gist (room WITH =, int4range(t_start, t_end) WITH &&);). Insert fails on the second commit.

Compare and Distinguish

  • Read Committed vs Repeatable Read vs Serializable: what does each allow and what does each cost?
  • SELECT ... FOR UPDATE vs raising isolation: when is one better?
  • Lost update at Read Committed vs at SI (first-committer-wins): which wins which race?
  • MySQL InnoDB REPEATABLE READ vs PostgreSQL REPEATABLE READ: what is actually different?

Common Mistake Check

  1. "I used a transaction, so concurrent updates are safe." Demonstrate on counter.
  2. "I'm at Serializable, so I never need to retry." Demonstrate SQLSTATE 40001 and write the retry wrapper.
  3. "Postgres Repeatable Read gives me ANSI Repeatable Read semantics." Show where it differs (write skew).
  4. "My app-level uniqueness check is enough." Show a concurrent signup race that succeeds despite the check.

Evidence Check

This lab is complete only if:

  • You have a terminal log showing each of the six anomalies reproduced at a level where they are permitted, and each eliminated at a stronger level (or with a stronger primitive).
  • You have written a Postgres retry loop that catches 40001 and retries with backoff.
  • You can name, without reference, the weakest PostgreSQL setting that prevents each anomaly.