Isolation and Anomalies Lab
Reproduce the six core concurrency anomalies on a real PostgreSQL instance, then eliminate each with the right setting. The goal is muscle memory, not a research write-up.
Setup
You need: PostgreSQL 13+, two psql sessions (call them A and B), a throwaway database.
-- as a setup step, in any session
CREATE TABLE accounts (id TEXT PRIMARY KEY, balance INT);
INSERT INTO accounts VALUES ('a1', 500), ('a2', 500);
CREATE TABLE doctors (name TEXT PRIMARY KEY, on_call BOOLEAN);
INSERT INTO doctors VALUES ('alice', TRUE), ('bob', TRUE);
CREATE TABLE counter (id INT PRIMARY KEY, n INT);
INSERT INTO counter VALUES (1, 0);
Between anomaly reproductions, reset the data with DELETE + INSERT as needed.
Retrieval Prompts
- State from memory the definition of each of dirty read, dirty write, lost update, read skew, write skew, phantom.
- Name the weakest isolation level on PostgreSQL that prevents each.
- State what PostgreSQL's
REPEATABLE READactually is (hint: not ANSI Repeatable Read). - State what PostgreSQL's
SERIALIZABLEactually is. - State when a SERIALIZABLE transaction will raise
SQLSTATE 40001and what the client must do.
Lab 1: Lost Update (Read Committed)
In session A:
BEGIN;
SELECT n FROM counter WHERE id = 1; -- returns 0
Now in session B:
BEGIN;
SELECT n FROM counter WHERE id = 1; -- returns 0
UPDATE counter SET n = 1 WHERE id = 1; -- based on the value we read
COMMIT;
Back in session A:
UPDATE counter SET n = 1 WHERE id = 1; -- same computed value; should have been 2
COMMIT;
Final SELECT n FROM counter WHERE id = 1: observe 1 instead of 2. Lost update reproduced.
Fix 1: Raise A and B to REPEATABLE READ (BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;). Second committer gets could not serialize access due to concurrent update. Retry on the client.
Fix 2: Use atomic UPDATE counter SET n = n + 1 WHERE id = 1;. No read; no lost update possible at any level.
Fix 3: SELECT n FROM counter WHERE id = 1 FOR UPDATE; at the read step in Read Committed.
Try all three and observe the diffs.
Lab 2: Read Skew (Read Committed)
In session A:
BEGIN;
SELECT balance FROM accounts WHERE id = 'a1'; -- returns 500
In session B (commits atomically):
BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE id = 'a1';
UPDATE accounts SET balance = balance + 100 WHERE id = 'a2';
COMMIT;
Back in A:
SELECT balance FROM accounts WHERE id = 'a2'; -- returns 600; sum = 1100 which never existed
COMMIT;
Fix: Raise A to REPEATABLE READ (SI in Postgres). Session A sees the snapshot from BEGIN; balance for a2 still returns 500. Total is 1000, consistent.
Lab 3: Write Skew (under REPEATABLE READ / SI)
Set both on_call = TRUE for alice and bob. Invariant: at least one on-call.
In session A, at REPEATABLE READ:
BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
SELECT COUNT(*) FROM doctors WHERE on_call; -- returns 2
Then in session B, same isolation:
BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
SELECT COUNT(*) FROM doctors WHERE on_call; -- returns 2
UPDATE doctors SET on_call = FALSE WHERE name = 'bob';
COMMIT;
In A:
UPDATE doctors SET on_call = FALSE WHERE name = 'alice';
COMMIT;
Observe both commits succeed. SELECT COUNT(*) FROM doctors WHERE on_call now returns 0. Invariant violated. Write skew reproduced under SI.
Fix: Raise both to SERIALIZABLE. One transaction will abort with could not serialize access due to read/write dependencies. Client retries; second attempt reads on_call = 1 and declines to update.
Lab 4: Phantom (via RR / SI, unique-check pattern)
Run the "reserve the meeting room" experiment from concept 5 against a table:
CREATE TABLE bookings (id SERIAL PRIMARY KEY, room INT, t_start INT, t_end INT);
Both sessions at REPEATABLE READ:
-- A:
BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
SELECT COUNT(*) FROM bookings WHERE room = 7 AND t_start < 4 AND t_end > 3; -- returns 0
-- B:
BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
SELECT COUNT(*) FROM bookings WHERE room = 7 AND t_start < 4 AND t_end > 3; -- returns 0
INSERT INTO bookings (room, t_start, t_end) VALUES (7, 3, 4);
COMMIT;
-- A:
INSERT INTO bookings (room, t_start, t_end) VALUES (7, 3, 4);
COMMIT;
Two overlapping bookings now exist. Phantom reproduced.
Fix 1: Raise both to SERIALIZABLE. One aborts.
Fix 2: Use an EXCLUSION constraint (ALTER TABLE bookings ADD EXCLUDE USING gist (room WITH =, int4range(t_start, t_end) WITH &&);). Insert fails on the second commit.
Compare and Distinguish
- Read Committed vs Repeatable Read vs Serializable: what does each allow and what does each cost?
SELECT ... FOR UPDATEvs raising isolation: when is one better?- Lost update at Read Committed vs at SI (first-committer-wins): which wins which race?
- MySQL InnoDB
REPEATABLE READvs PostgreSQLREPEATABLE READ: what is actually different?
Common Mistake Check
- "I used a transaction, so concurrent updates are safe." Demonstrate on counter.
- "I'm at Serializable, so I never need to retry." Demonstrate
SQLSTATE 40001and write the retry wrapper. - "Postgres Repeatable Read gives me ANSI Repeatable Read semantics." Show where it differs (write skew).
- "My app-level uniqueness check is enough." Show a concurrent signup race that succeeds despite the check.
Evidence Check
This lab is complete only if:
- You have a terminal log showing each of the six anomalies reproduced at a level where they are permitted, and each eliminated at a stronger level (or with a stronger primitive).
- You have written a Postgres retry loop that catches
40001and retries with backoff. - You can name, without reference, the weakest PostgreSQL setting that prevents each anomaly.