Field Guide / Reference / Transactions
Transactions

Isolation levels and anomalies

A printable cheat sheet: the canonical level-by-anomaly matrix, the schedule that defines each anomaly, and the two facts that explain the table (levels are defined by phenomena, not by locks, and snapshot isolation does not fit on the ANSI ladder).

The canonical matrix

Rows are isolation levels from weakest to strongest. Columns are anomalies. A cell says whether that level allows the anomaly or prevents it. The ANSI columns are dirty read, non-repeatable read, and phantom; the extra columns add write skew, the anomaly that places snapshot isolation off the ANSI ladder. Dirty write (P0) is omitted as a column because every level forbids it.

Allowed (A) versus prevented (P) per level. ANSI levels first, then snapshot isolation as a separate row.
Isolation level Dirty read
P1
Non-repeatable read
P2
Phantom
P3
Write skew
A5B
Read uncommitted allowed allowed allowed allowed
Read committed prevented allowed allowed allowed
Repeatable read prevented prevented allowed
ANSI floor
allowed
Serializable prevented prevented prevented prevented
Snapshot isolation
not on the ANSI ladder
prevented prevented prevented allowed
the SI hole
Reading the rows. Each step down the ANSI ladder forbids one more phenomenon: read committed kills P1, repeatable read adds P2, serializable adds P3 and everything else. Snapshot isolation is not a row between two ANSI levels. It forbids the phantom that repeatable read still permits, yet it permits write skew that no serializable schedule allows, so it is incomparable with repeatable read, not above or below it.

PostgreSQL does not match this table verbatim

PostgreSQL implements three distinct levels behind the four names. Read uncommitted behaves exactly like read committed, so dirty reads never occur at any level. Its repeatable read is snapshot isolation, so it prevents phantoms despite the ANSI floor allowing them. Its serializable is serializable snapshot isolation (SSI), which catches write skew by aborting a transaction with could not serialize access due to read/write dependencies.

Anomaly definitions with a schedule

Notation follows the critique paper: w1[x] is transaction 1 writing item x, r2[x] is transaction 2 reading x, c1 is commit, a1 is abort. The subscript names the transaction. Every anomaly is an interleaved result that no serial order could produce.

Dirty write (P0)

Two transactions write the same item with one write interleaved between the other's, so a clean rollback to a consistent state becomes impossible. Forbidden at every level, which is why it is not a matrix column.

w1[x] ... w2[x] ... (both commit)   first writer cannot undo cleanly

Dirty read (P1)

T2 reads a value T1 wrote but has not committed. If T1 later aborts, T2 read a value that never officially existed.

w1[x]  r2[x]  a1        T2 saw x that T1 then rolled back

Non-repeatable read, also fuzzy read (P2)

T1 reads x, T2 modifies and commits x, T1 reads x again inside the same transaction and gets a different committed value. The changed thing is the value of a row T1 already saw.

r1[x]  w2[x]  c2  r1[x]       second r1[x] differs from first

Phantom (P3)

T1 evaluates a predicate P (for example, all employees in department 5), T2 inserts or deletes a row matching P and commits, and T1's second evaluation of P returns a different set of rows. The changed thing is which rows exist, not the value of a known row, so preventing it needs predicate, range, or gap locking, not row locks.

r1[P]  w2[y in P]  c2  r1[P]   second r1[P] returns a different row set

Lost update

Two read-modify-write cycles where one overwrite silently discards the other's committed update. Both read x = 10, both compute x + 1, both write 11; one increment vanishes. Can happen with no dirty read.

r1[x]  r2[x]  w1[x]  w2[x]     T1's update is overwritten and lost

Write skew (A5B)

Two transactions read an overlapping set, each checks a constraint that currently holds, then each writes a different item. Each write keeps the constraint true given what that transaction read, but together they break it. No write-write conflict exists to catch, yet the precedence graph has a cycle. This is the anomaly snapshot isolation permits.

r1[x,y]  r2[x,y]  w1[x]  w2[y]  c1 c2   constraint over {x,y} now violated

The on-call example

Rule: at least one doctor must stay on call. Two doctors both read "two on call, fine," each removes themselves, and now zero are on call. Each transaction was locally correct; together they are not.

Why write skew escapes snapshot isolation

Snapshot isolation resolves write-write conflicts by First-Committer-Wins, which only fires when two transactions write the same item. Write skew writes different items, so nothing fires, yet the read-write dependencies still form a cycle in the serialization graph. A cycle means no serial order exists.

T1 T2 rw on y (T1 read, T2 wrote) rw on x (T2 read, T1 wrote)
The two read-write anti-dependency edges close a cycle, so the schedule is not conflict serializable. Serializable snapshot isolation (SSI) detects exactly this dangerous structure (a transaction with both an incoming and an outgoing rw edge) and aborts a participant.

2PL versus MVCC, one line each

ApproachOne-liner
2PL Force serializability by blocking: acquire all locks (growing phase) before releasing any (shrinking phase); strict 2PL holds write locks to commit, so readers and writers wait on each other.
MVCC Avoid the waiting by keeping old versions: each transaction reads a consistent snapshot, so reads never block writes and writes never block reads; writers are still serialized by some protocol on top.

The ANSI critique note

Berenson, Bernstein, Gray, Melton, O'Neil, O'Neil, SIGMOD 1995

The critique paper argues the ANSI anomaly-based definitions are underspecified. Its main points:

The one thing to remember

A phantom is a changed row set (insert or delete under a predicate); a non-repeatable read is a changed value of a row already seen. Repeatable read forbids the second but allows the first, which is exactly why predicate or range locking, not row locking, is needed for serializable.