Transactions at the SQL Level: `BEGIN`, `COMMIT`, `ROLLBACK`, Isolation Preview

What This Concept Is

A transaction is a unit of work that moves the database from one consistent state to another. Inside one, a sequence of statements is treated as one atomic change with respect to other transactions and to crashes.

The SQL interface:

BEGIN (or START TRANSACTION) opens a transaction.
COMMIT makes its effects durable and visible to others.
ROLLBACK discards every change since BEGIN.
SAVEPOINT name / ROLLBACK TO SAVEPOINT name let you unwind partially without losing the whole transaction.

The classical properties -- ACID:

Atomic: all statements succeed or none do.
Consistent: every committed transaction leaves integrity constraints satisfied.
Isolated: concurrent transactions do not see each other's partial work (bounded by the isolation level).
Durable: once committed, changes survive crashes.

Isolation levels (SQL standard, used by PostgreSQL):

READ UNCOMMITTED: dirty reads allowed. PostgreSQL treats this as READ COMMITTED.
READ COMMITTED (PG default): you never see uncommitted data, but rows can change between two reads in the same transaction.
REPEATABLE READ: within one transaction, every read of a row returns the same value (no non-repeatable reads). PG additionally prevents phantom reads at this level.
SERIALIZABLE: the result must be equivalent to some serial order of the concurrent transactions. Strongest; can abort transactions with a serialization error.

Deeper treatment is in future semester modules (concurrency control, MVCC). This concept is the operational preview.

Why It Matters Here (connection to primary)

Every multi-statement change (insert order + charge balance + log event) needs a transaction or you will ship a bug the moment something fails between statements.
Most production bugs labeled "race condition" at the SQL layer are one of: no transaction at all, the wrong isolation level, or a long-running transaction holding locks.
Later modules treat MVCC, locking, and distributed transactions. You cannot follow those if you do not already know the SQL-level surface.

Concrete Example

Classic bank transfer (must never leave money partially moved):

BEGIN;

UPDATE account
SET balance = balance - 100
WHERE id = 1;

UPDATE account
SET balance = balance + 100
WHERE id = 2;

INSERT INTO transfer_log(from_id, to_id, amount, at)
VALUES (1, 2, 100, now());

COMMIT;

If the second UPDATE raises (say, account 2 has a CHECK violation), ROLLBACK undoes the debit. Without the transaction, account 1 lost money into the void.

Isolation in action. Two concurrent sessions in PostgreSQL:

-- Session A
BEGIN ISOLATION LEVEL REPEATABLE READ;
SELECT balance FROM account WHERE id = 1;  -- returns 100

-- Session B (same time)
BEGIN;
UPDATE account SET balance = 50 WHERE id = 1;
COMMIT;

-- Session A
SELECT balance FROM account WHERE id = 1;  -- still returns 100, not 50
COMMIT;

At READ COMMITTED, Session A would see 50 on the second read.

Savepoints:

BEGIN;
INSERT INTO orders(...) VALUES (...);
SAVEPOINT before_items;
INSERT INTO order_item(...) VALUES (...);
-- Some validation fails
ROLLBACK TO SAVEPOINT before_items;
INSERT INTO order_item(...) VALUES (...);  -- corrected version
COMMIT;

Common Confusion / Misconception

"ACID means everything is serial." No. ACID is about guarantees; implementations use MVCC and fine-grained locking to preserve them with high concurrency.

"READ COMMITTED is always correct." A lot of classic bugs come from "check-then-act" at READ COMMITTED: SELECT balance, then UPDATE balance = balance - 100. Between the two, another transaction may commit. Either compute in one statement (UPDATE ... WHERE balance >= 100), or use SELECT ... FOR UPDATE, or raise the isolation level.

"Long transactions are fine as long as they commit." A transaction open for hours holds snapshots and potentially locks, blocks vacuum (in PostgreSQL), and inflates table bloat. Keep transactions short and commit frequently.

How To Use It

Wrap any multi-statement change that must succeed or fail as a unit in a transaction.
Keep transactions short; do not perform external I/O (HTTP, email) inside them.
Default to READ COMMITTED; escalate to REPEATABLE READ or SERIALIZABLE only for workflows where snapshot stability or serializable guarantees are required, and be prepared to retry on 40001 serialization failures.
For hot-spot updates, prefer single-statement atomic updates (UPDATE ... WHERE cond) or explicit row locks (SELECT ... FOR UPDATE).
Log transaction boundaries in your service for post-mortem correlation.

Check Yourself

Describe a "lost update" in one sentence, and give one way to prevent it at READ COMMITTED.
What is the difference between ROLLBACK and ROLLBACK TO SAVEPOINT?
Why is it dangerous to do an external HTTP call in the middle of a BEGIN / COMMIT block?

Mini Drill or Application

Using your account table:

Write and execute the transfer transaction above in psql.
Provoke a failure (force the second UPDATE to violate a CHECK), confirm the first UPDATE was rolled back.
In two psql sessions, reproduce a non-repeatable read at READ COMMITTED, then show that REPEATABLE READ prevents it.
Use SELECT ... FOR UPDATE to eliminate a check-then-act race on a balance check.

What This Concept Is​

Why It Matters Here (connection to primary)​

Concrete Example​

Common Confusion / Misconception​

How To Use It​

Check Yourself​

Mini Drill or Application​

Read This Only If Stuck​