Serializable Snapshot Isolation (SSI)
What This Concept Is
SSI is Snapshot Isolation plus a runtime check that detects patterns that could produce non-serializable outcomes and aborts one of the transactions involved. The seminal paper is Cahill, Röhm, and Fekete (2008); PostgreSQL's SERIALIZABLE level (since 9.1) is an SSI implementation.
Key insight: under SI, every non-serializable anomaly (write skew and its predicate-based, phantom-like variants) involves rw-antidependencies. T1 --rw--> T2 means that T1 read data that a concurrent T2 wrote, and T1 did not see T2's write. A dangerous structure is two consecutive antidependencies, T1 --rw--> T2 --rw--> T3, where T3 commits before the other two (T1 and T3 may be the same transaction). The theory that Cahill et al. build on (Fekete et al., 2005) shows that every non-serializable SI execution contains such a structure. The structure is necessary but not sufficient, so SSI can occasionally abort a transaction whose execution would have been serializable.
SSI tracks rw-antidependencies and aborts a transaction when it finds a pivot (a transaction with both an incoming and an outgoing rw-antidependency) whose neighbors commit in the dangerous order.
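The pivot check can be illustrated with a toy model. The sketch below is not how any real engine is implemented: the `Txn` record, the logical timestamps, and the function names are all assumptions made for illustration, and it inspects a finished schedule after the fact, whereas a real engine has to decide while transactions are still running.

```python
from dataclasses import dataclass
from itertools import permutations


@dataclass(frozen=True)
class Txn:
    """Hypothetical record of one finished transaction (toy model only)."""
    name: str
    start: int          # logical time the snapshot was taken
    commit: int         # logical commit time
    reads: frozenset    # items read
    writes: frozenset   # items written


def concurrent(a, b):
    # Two transactions overlap if neither committed before the other started.
    return a.start < b.commit and b.start < a.commit


def rw_edges(txns):
    # (a, b) means a --rw--> b: a read an item that the concurrent b wrote.
    return {
        (a.name, b.name)
        for a, b in permutations(txns, 2)
        if concurrent(a, b) and a.reads & b.writes
    }


def dangerous_structures(txns):
    # Find t_in --rw--> pivot --rw--> t_out where t_out commits first.
    # t_in and t_out may be the same transaction.
    by_name = {t.name: t for t in txns}
    edges = rw_edges(txns)
    found = []
    for t_in, pivot in edges:
        for src, t_out in edges:
            if src != pivot:
                continue
            earliest_other = min(by_name[t_in].commit, by_name[pivot].commit)
            if by_name[t_out].commit <= earliest_other:
                found.append((t_in, pivot, t_out))
    return sorted(found)


# Two overlapping transactions that each read both items and write one of them.
t1 = Txn("T1", 1, 7, frozenset({"alice", "bob"}), frozenset({"alice"}))
t2 = Txn("T2", 2, 8, frozenset({"alice", "bob"}), frozenset({"bob"}))
print(dangerous_structures([t1, t2]))  # [('T1', 'T2', 'T1')]
```

In this toy run T2 is the pivot and T1 plays both outer roles, which is the two-transaction special case where T1 = T3.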
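Crucially, SSI is optimistic: the serializability check itself never makes a transaction wait. (In PostgreSQL, writers still wait on ordinary row locks when two transactions update the same row, exactly as under SI.) Transactions proceed on their snapshot while the engine records antidependencies as reads and writes happen and checks them again at commit. When a dangerous pattern appears, one transaction fails with SQLSTATE 40001 ("could not serialize access due to read/write dependencies among transactions") and the client must retry.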
Why It Matters Here
SSI is how Postgres gives you real serializability without the lock contention of 2PL. It preserves the scale benefits of MVCC for the normal case and pays the cost (aborts) only when a pattern appears that could violate serializability. In practice this is far cheaper than 2PL for most OLTP mixes, which is why Postgres chose it. It is also a decent mental model for optimistic concurrency control in general.
Concrete Example
This is the write-skew scenario from concept 5, now run under SSI. Alice and Bob each count the doctors with on_call = true, each sees 2, and each then tries to take themselves off call.
time  T1 (Alice)                             T2 (Bob)
1     BEGIN SERIALIZABLE; snap = S1
2                                            BEGIN SERIALIZABLE; snap = S2
3     SELECT COUNT(*) FROM doctors
        WHERE on_call  -> 2
      (T1 forms a read set including both
       Alice's row and Bob's row)
4                                            SELECT COUNT(*) FROM doctors
                                               WHERE on_call  -> 2
                                             (T2 reads Alice and Bob rows too)
5     UPDATE Alice SET on_call=false
      (T1 writes Alice; T2 had read Alice
       -> rw-antidependency T2 --rw--> T1)
6                                            UPDATE Bob SET on_call=false
                                             (T2 writes Bob; T1 had read Bob
                                              -> rw-antidependency T1 --rw--> T2)
7     COMMIT
8                                            COMMIT
      -- cycle detected: T1 --rw--> T2 --rw--> T1 --
      -- one aborted with SQLSTATE 40001 --
Under plain SI, steps 7 and 8 both succeed, breaking the invariant. Under SSI, one transaction commits, the other aborts with a serialization failure, and the client retries.
Common Confusion / Misconception
"SSI adds locks." No. SSI is lockless. It tracks read sets and write sets and the resulting antidependency graph; transactions proceed without blocking.
"SSI is free on top of SI." Tracking read sets has overhead and memory cost. On very high-throughput workloads the engine may coarsen tracking (e.g., per-page instead of per-row), which increases false positives (aborts when serializability was not actually violated).
"A serialization failure means a bug in my transaction." It means your transaction pattern could have produced a non-serializable result. Your code must retry. A production-quality Serializable transaction in Postgres is always wrapped in a retry loop.
"SSI prevents phantoms." Yes, in the sense that predicate reads contribute to the read set, and concurrent inserts that match the predicate produce rw-antidependencies that trigger aborts.
How To Use It
When using PostgreSQL SERIALIZABLE:
- Wrap every Serializable transaction in a retry loop that catches SQLSTATE 40001 (serialization_failure) and 40P01 (deadlock_detected) and retries with backoff.
- Keep transactions short; long transactions accumulate large read sets and increase abort rates.
- Measure abort rates as a first-class metric. If above ~5%, either reduce contention or fall back to explicit locking on the offending path.
- Prefer Serializable when invariants span multiple rows and you cannot express them as single-row constraints.
- For pure read-modify-write on a single row, SI with atomic UPDATE is usually sufficient and cheaper.
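The retry loop from the first bullet above might look roughly like the following. It is a minimal, driver-agnostic sketch: `run_with_retry`, `sqlstate_of`, and the default parameters are hypothetical names and values, and the only driver detail assumed is that the raised exception carries the SQLSTATE as a `sqlstate` attribute (as in psycopg 3) or a `pgcode` attribute (as in psycopg2).

```python
import random
import time

# serialization_failure and deadlock_detected
RETRYABLE_SQLSTATES = {"40001", "40P01"}


def sqlstate_of(exc):
    # Assumption: the driver exposes the SQLSTATE as .sqlstate or .pgcode.
    # Adjust this accessor for drivers that report it differently.
    return getattr(exc, "sqlstate", None) or getattr(exc, "pgcode", None)


def run_with_retry(txn_fn, max_attempts=5, base_delay=0.05, sleep=time.sleep):
    """Call txn_fn() until it succeeds, a non-retryable error occurs,
    or max_attempts is reached.

    txn_fn must run the whole transaction (begin through commit) and roll
    back on failure, so that every attempt starts from a fresh snapshot.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return txn_fn()
        except Exception as exc:
            retryable = sqlstate_of(exc) in RETRYABLE_SQLSTATES
            if not retryable or attempt == max_attempts:
                raise
            # Exponential backoff with jitter so competing retries spread out.
            delay = base_delay * (2 ** (attempt - 1)) * random.uniform(0.5, 1.5)
            sleep(delay)
```

The important design point is that the whole transaction is re-executed, including its reads; replaying only the failed statement would reuse decisions made on a stale snapshot. Counting how many attempts each call needs is one simple way to feed the abort-rate metric mentioned above.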
Check Yourself
- What is a rw-antidependency, and why is it the only new kind of conflict SSI needs to detect beyond SI?
- Why must application code be prepared to retry Serializable transactions?
- What is the tradeoff between fine-grained and coarse-grained read-set tracking in SSI?
- Give a schedule that SSI would abort but SI would commit.
Mini Drill or Application
For each schedule, predict under PostgreSQL SERIALIZABLE whether both commit or one aborts:
- T1 reads x=5, writes y=10; T2 reads y=0, writes x=10; T1 commits; T2 commits.
- T1 reads all rows where flag=true (count=3); T1 inserts a row with flag=true; T2 does the same concurrently; both commit.
- T1 reads x; T1 writes y; T2 reads y; T2 writes x; T1 commits before T2.
- T1 reads x and writes x=x+1; T2 reads x and writes x=x+1 (both at SERIALIZABLE).
- T1 is purely read-only, reads a consistent snapshot; T2 does multiple writes and commits concurrently.