Isolation levels and anomalies
A printable cheat sheet: the canonical level-by-anomaly matrix, the schedule that defines each anomaly, and the two facts that explain the table (levels are defined by phenomena, not by locks, and snapshot isolation does not fit on the ANSI ladder).
The canonical matrix
Rows are isolation levels from weakest to strongest. Columns are anomalies. A cell says whether that level allows the anomaly or prevents it. The ANSI columns are dirty read, non-repeatable read, and phantom; the extra columns add write skew, the anomaly that places snapshot isolation off the ANSI ladder. Dirty write (P0) is omitted as a column because every level forbids it.
| Isolation level | Dirty read P1 |
Non-repeatable read P2 |
Phantom P3 |
Write skew A5B |
|---|---|---|---|---|
| Read uncommitted | allowed | allowed | allowed | allowed |
| Read committed | prevented | allowed | allowed | allowed |
| Repeatable read | prevented | prevented | allowed ANSI floor |
allowed |
| Serializable | prevented | prevented | prevented | prevented |
| Snapshot isolation not on the ANSI ladder |
prevented | prevented | prevented | allowed the SI hole |
PostgreSQL does not match this table verbatim
PostgreSQL implements three distinct levels behind the four names. Read uncommitted behaves exactly like
read committed, so dirty reads never occur at any level. Its repeatable read is snapshot isolation, so it
prevents phantoms despite the ANSI floor allowing them. Its serializable is serializable snapshot isolation
(SSI), which catches write skew by aborting a transaction with could not serialize access due to
read/write dependencies.
Anomaly definitions with a schedule
Notation follows the critique paper: w1[x] is transaction 1 writing item x, r2[x]
is transaction 2 reading x, c1 is commit, a1 is abort. The subscript names the
transaction. Every anomaly is an interleaved result that no serial order could produce.
Dirty write (P0)
Two transactions write the same item with one write interleaved between the other's, so a clean rollback to a consistent state becomes impossible. Forbidden at every level, which is why it is not a matrix column.
w1[x] ... w2[x] ... (both commit) first writer cannot undo cleanly
Dirty read (P1)
T2 reads a value T1 wrote but has not committed. If T1 later aborts, T2 read a value that never officially existed.
w1[x] r2[x] a1 T2 saw x that T1 then rolled back
Non-repeatable read, also fuzzy read (P2)
T1 reads x, T2 modifies and commits x, T1 reads x again inside the same transaction and gets a different committed value. The changed thing is the value of a row T1 already saw.
r1[x] w2[x] c2 r1[x] second r1[x] differs from first
Phantom (P3)
T1 evaluates a predicate P (for example, all employees in department 5), T2 inserts or deletes a row matching P and commits, and T1's second evaluation of P returns a different set of rows. The changed thing is which rows exist, not the value of a known row, so preventing it needs predicate, range, or gap locking, not row locks.
r1[P] w2[y in P] c2 r1[P] second r1[P] returns a different row set
Lost update
Two read-modify-write cycles where one overwrite silently discards the other's committed update. Both read x = 10, both compute x + 1, both write 11; one increment vanishes. Can happen with no dirty read.
r1[x] r2[x] w1[x] w2[x] T1's update is overwritten and lost
Write skew (A5B)
Two transactions read an overlapping set, each checks a constraint that currently holds, then each writes a different item. Each write keeps the constraint true given what that transaction read, but together they break it. No write-write conflict exists to catch, yet the precedence graph has a cycle. This is the anomaly snapshot isolation permits.
r1[x,y] r2[x,y] w1[x] w2[y] c1 c2 constraint over {x,y} now violated
The on-call example
Rule: at least one doctor must stay on call. Two doctors both read "two on call, fine," each removes themselves, and now zero are on call. Each transaction was locally correct; together they are not.
Why write skew escapes snapshot isolation
Snapshot isolation resolves write-write conflicts by First-Committer-Wins, which only fires when two transactions write the same item. Write skew writes different items, so nothing fires, yet the read-write dependencies still form a cycle in the serialization graph. A cycle means no serial order exists.
2PL versus MVCC, one line each
| Approach | One-liner |
|---|---|
| 2PL | Force serializability by blocking: acquire all locks (growing phase) before releasing any (shrinking phase); strict 2PL holds write locks to commit, so readers and writers wait on each other. |
| MVCC | Avoid the waiting by keeping old versions: each transaction reads a consistent snapshot, so reads never block writes and writes never block reads; writers are still serialized by some protocol on top. |
The ANSI critique note
Berenson, Bernstein, Gray, Melton, O'Neil, O'Neil, SIGMOD 1995
The critique paper argues the ANSI anomaly-based definitions are underspecified. Its main points:
- Levels are defined by phenomena, not by locking. ANSI names which of P1, P2, P3 each level must forbid; it does not mandate an implementation. A common MCQ trap states the reverse.
- P0 dirty write was missing. The paper adds it because the loosest level must still forbid overlapping writes, otherwise rollback is ill defined.
- The phenomena are ambiguous. They admit a loose (strict) and a broad reading, which is why the English definitions fail to pin down the intended locking levels.
- Snapshot isolation breaks the ladder. SI forbids P1, P2, and the simple phantom, so it sits above repeatable read on those, yet it permits write skew (A5B) and a read-only anomaly, so it is weaker than serializable. SI is therefore incomparable with repeatable read, the headline result.
The one thing to remember
A phantom is a changed row set (insert or delete under a predicate); a non-repeatable read is a changed value of a row already seen. Repeatable read forbids the second but allows the first, which is exactly why predicate or range locking, not row locking, is needed for serializable.