ARIES recovery on one page
The write-ahead log, the three bookkeeping structures, the three restart passes, and why steal and no-force force undo and redo. Mohan et al., ACM TODS 17(1), 1992.
The WAL rule
In-place updating: a changed page is written back to the same disk location it came from, so the log is the only source of truth for recovery (not a shadow copy). Two obligations make this safe:
- Undo before steal. Before a dirty page is written to the durable database, the log records describing its changes (at least the undo portions) must already be on stable storage. The buffer manager forces the log up to that page's
pageLSNbefore stealing it. - Redo before commit. A transaction is not committed until its commit record and all earlier log up to that LSN are forced to stable storage.
Log first, page later. Rule 1 guarantees you can always undo an uncommitted change that leaked to disk; rule 2 guarantees a committed change can always be redone. Everything else in ARIES is bookkeeping that makes those two replays exact.
The fields that make replay exact
| Name | Lives on | What it is |
|---|---|---|
LSN | every log record | Log sequence number. Monotonically increasing identifier of a log record. |
pageLSN | every data page | LSN of the log record for the most recent update applied to that page. Lets redo test whether a logged update is already present. |
prevLSN | every log record | LSN of the previous record written by the same transaction. Chains a transaction's records into a backward list (first record holds 0). |
recLSN | Dirty Page Table entry | Recovery LSN. End-of-log LSN at the moment the page was first dirtied; the earliest log point whose updates may not yet be on disk. |
undoNextLSN | CLRs only | Points to the predecessor of the record this CLR just compensated, that is, the next record of that transaction still to be undone. |
The two in-memory tables
| Field | Meaning |
|---|---|
TransID | transaction id |
State | P prepared / in-doubt, U unprepared / undoable, committed |
LastLSN | LSN of its latest log record |
UndoNxtLSN | next record to undo: from the last CLR's undoNextLSN if that record is a CLR, else equals LastLSN |
| Field | Meaning |
|---|---|
PageID | the dirty page |
recLSN | set to end-of-log LSN when a clean page is first fixed to modify; entry removed when the page is written to disk |
Both tables are written into checkpoint records, initialized from the latest checkpoint at restart, and brought up to date by the Analysis pass.
Compensation Log Records
When ARIES undoes an update it logs the undo as a CLR. CLRs are redo-only: they are never themselves undone. The CLR's undoNextLSN records how far rollback has progressed, so a failure or nested rollback during undo resumes from that point and skips already-undone work. This bounds log volume under repeated or nested rollbacks. Because the CLR describes what the undo actually did, undo need not be the physical inverse on the same page, which enables logical undo and high-concurrency operations.
CLRs are never undone (they have no undo); only regular update records are. IMS undid the same non-CLR repeatedly and DB2 undid CLRs too. ARIES does neither, which is the whole point of undoNextLSN.
The three restart passes
- Analysis. Scan forward from the most recent checkpoint to the end of the log. Builds: the Transaction Table and Dirty Page Table brought up to end of log, the set of loser transactions to undo, and
RedoLSN = min(recLSN)over all Dirty Page Table entries, where the redo pass must start. - Redo (repeating history). Scan forward from
RedoLSNto end of log. For each redoable update or CLR whosePageIDis in the Dirty Page Table and whoseLSN ≥ page recLSN, fix and latch the page. Conditional: reapply only ifpageLSN < LSN, then setpageLSN = LSN; otherwise skip (already on disk). Reapplies updates of all transactions, including losers, to rebuild the exact failure-time state. No logging during redo. Also reacquires locks for in-doubt (prepared) distributed transactions. - Undo of losers. Roll back all losers in reverse chronological order in a single sweep, repeatedly taking the max of the next-to-be-undone LSNs across not-yet-finished losers. Unconditional: no
pageLSNcomparison, because repeating history guarantees every loser update is present. Each undone update writes a CLR; on hitting a CLR, jump to itsundoNextLSNto skip already-undone work.
Redoing even the losers reestablishes the exact state at the moment of the crash. Only then are losers undone. This is what guarantees ARIES never undoes an effect that is absent and never undoes-then-redoes a committed change already on disk, the two bugs it criticizes in older WAL methods. Losers = transactions neither committed nor in the in-doubt (prepared) state at failure.
RedoLSN. Redo replays forward from RedoLSN = min(recLSN), which can sit before the checkpoint. Undo then sweeps backward, compensating losers with CLRs. Redo is conditional on pageLSN; undo is unconditional.Steal and no-force: what each needs
The buffer policy decides which replays are possible, so it decides which recovery work a crash can demand. ARIES supports the hard corner, steal + no-force, which is why it needs both undo and redo.
| No-steal dirty page of an uncommitted txn never written | Steal uncommitted dirty page may reach disk | |
|---|---|---|
| Force committed txn's pages flushed at commit |
No undo, no redo. Simplest, but force is slow (random I/O at commit) and no-steal needs a huge buffer. | Undo only. Stolen uncommitted changes must be rolled back, but committed work is already on disk. |
| No-force commit need not flush pages |
Redo only. Committed changes may be missing from disk and must be replayed. | Undo + redo. The ARIES corner. Fast at runtime, hardest at restart. |
Steal -> undo: an uncommitted transaction's dirty page can land on disk before commit, so a crash leaves uncommitted changes that must be undone. No-force -> redo: a committed transaction's dirty pages may still be in the buffer at commit, so a crash can lose them and they must be redone. ARIES imposes no page-replacement restriction beyond enforcing WAL.
Forward, from RedoLSN. Conditional: apply only if pageLSN < LSN. All transactions. No logging.
Backward, losers only. Unconditional: no pageLSN test. Writes a CLR per undo; follows undoNextLSN to skip done work.
Section 4 for the data structures (pageLSN, Transaction Table, Dirty Page Table, recLSN, CLR fields), Section 6 for the three passes. Section 3 for repeating history and CLRs.