Field Guide / Part I · Foundations / Week 01
Query layer

DBMS architecture: the life of a query

You type one short SELECT and rows come back. In between, a multi-user server has to decide who runs, turn a string into a plan, find bytes on a disk, and walk the answer back up. This lesson is the map of that journey, and the map every later week points back to.

By the end you can

  • Name the five components of a relational engine and say what each one does for one query.
  • Trace the query processing pipeline from a SQL string down to disk and back up, naming the artifact each stage produces.
  • Explain why a database is a server, not a library: admission control, workers, and the process-versus-thread fork.
  • Contrast how PostgreSQL and SQLite execute the same query, and place every later week on this map.

The whole course chases one query and one transaction through the engine. Here is the query, and it will reappear in every lesson from here on:

SELECT name FROM emp WHERE salary > 75000 ORDER BY name;

This is week 1, the top of the stack. Nothing is handed to us from above: a raw SQL string arrives over a socket from a client. What this layer hands down is also a string, but now to a specific worker process or thread that has been admitted to run, with the relational query processor ready to compile it. Everything below (parsing in week 8, optimization in week 11, execution in week 9, indexes in week 6, the buffer pool in week 4, and storage in week 2) is a layer this lesson names but does not yet open. If at any point a later topic feels like an island, come back here and find it on the slab.

Intuition

Think of an airport gate. A passenger asks for the flight manifest. A gate agent does not pull the wires out of the terminal; the request flows through a counter that knows who you are, a supervisor who decides whether there is staff free to handle it now, a clerk who reads the request and fetches the right records, and the answer comes back the same way it went in. A database is built the same way: layered handlers, each with one job, and the answer "unwinds the stack" back to you. The survey that grounds this lesson opens with exactly this gate-agent story.

The five components

A naive picture says the engine is one black box that reads files. That picture cannot explain how a hundred users share the machine, why a query can be told to wait, how rows get cached, or how a crash is survived. The moment you ask any of those questions you are forced to break the box into parts. Hellerstein, Stonebraker, and Hamilton break it into five [Architecture of a Database System, Fig. 1.1].

Client (JDBC/ODBC) sends SQL string 1. Client communications manager keeps connection state, ships results back 2. Process manager assigns a worker, runs admission control 3. Relational query processor authorize, parse, rewrite, optimize, execute (weeks 8, 11, 9, 10) 4. Transactional storage manager access methods (w6,7) buffer manager (w4,5) lock manager (w12,13) log manager (w14) 5. Shared: catalog + memory used by almost every query
Figure 1. The five components, redrawn from the survey's Figure 1.1. A query enters at the client, descends 1 to 4, and consults the shared catalog and memory (5) along the way. The labels in the storage box name the weeks that open each sub-component, so you can already see where the course is going.

Component by component, for our SELECT:

  1. Client communications manager. The connection already exists; this layer remembers who you are and over which protocol you speak, accepts the SQL text, and later ships name values back. It must speak many connectivity protocols because deployments are two-tier, three-tier, or four-tier [survey §1.1].
  2. Process manager. It assigns a thread of computation, a DBMS worker, to your command, with a 1:1 mapping: one worker handles all the SQL from your one connection. Its sharpest decision is admission control: run now, or defer until the machine has room. More on this below.
  3. Relational query processor. It checks that you are authorized, compiles the SQL into a query plan, and runs that plan through a suite of operators (selection, projection, sort) plus operators that fetch records from the layer below. This is weeks 8 through 11.
  4. Transactional storage manager. It owns every byte of data through four sub-managers: access methods (heaps and indexes), the buffer manager, the lock manager (isolation), and the log manager (durability and rollback). This is weeks 2 through 7 plus the transaction and recovery weeks.
  5. Shared components and utilities. The catalog manager (which tables and columns exist, with what types) and the memory manager are consulted during almost every query: authentication, parsing, optimization. Backup, replication, and monitoring run on the side.
Why split it into exactly these five

Two forces pull the box apart. First, a database is a server, shared by many callers, so something must manage connections (component 1) and decide who runs (component 2) before any data work begins. Second, SQL is declarative: it states what you want, not how to get it, so something must compile and run it (component 3) on top of something that safely persists and shares bytes (component 4). The catalog and memory (component 5) are pulled out because every other component needs them, so making them shared avoids threading the same dependency through all four.

Inside component 3: the query processing pipeline

Component 3 is itself a small pipeline. The survey describes it as taking "a declarative SQL statement, validates it, optimizes it into a procedural dataflow execution plan, and (subject to admission control) executes that dataflow program," after which the client pulls result tuples one at a time or in small batches [survey §4]. Each stage produces a distinct artifact, and naming the artifact is the whole trick to not getting lost.

The pipeline for SELECT name FROM emp WHERE salary > 75000 ORDER BY name, stage by artifact.
StageWhat it doesArtifact it hands onWeek
Parsetokenize the string, apply grammar rules onlyraw parse tree (names are still strings)8
Bind / analyzeresolve emp and salary against the catalog, attach types, authorizequery tree (validated, typed)8
Rewriteexpand views, fold constants, normalize predicates (query + catalog only, never data)logical plan (relational algebra)8, 11
Optimizeestimate selectivity, cost seq-scan versus index, pick the cheapest planphysical plan (concrete operators)11
Executerun the operator tree pull-style, fetching tuples through access methodsresult tuples9, 10

One subtlety worth pinning down now: parsing uses only syntactic rules and makes no catalog lookups. The reason is concrete. A catalog lookup runs inside a transaction, and pure syntax checking needs no transaction, so PostgreSQL splits "parse the string" from "bind the names" deliberately [PG parser stage]. Confusing parsing with binding is one of the most common exam traps, and week 8 returns to it.

Keep this one thing

A query travels down: string to raw parse tree to query tree to logical plan to physical plan to tuples, and then "unwinds the stack" back up. Each arrow is a layer with one job, and each layer is owned by a specific week of this course.

Simulator: trace the query stack

Reading the pipeline is one thing; watching one query crawl down it and the rows crawl back up is another. The simulator below animates the descent through the five components and the pipeline, then the unwind. Switch between PostgreSQL and SQLite to see where the two engines agree (the front: tokenize, parse, build a tree) and where they diverge (the back: PostgreSQL keeps named tree stages and a tree-walking executor; SQLite compiles to VDBE bytecode). Raise the result-set size and step into execution to watch FETCH repeat.

Query stack tracer step down, then watch it unwind
Experiment: set engine to SQLite and step past the optimizer. Notice the named "logical plan" and "physical plan" stages of PostgreSQL collapse into one "compile to VDBE bytecode" stage. Then raise result rows and watch the execute stage repeat its FETCH.

Why a database is a server, not a library

A SELECT against a local CSV file is a function call. A SELECT against a real engine crosses into a process that may be serving hundreds of other callers at the same instant. Two design questions fall out of that, and both shape component 2.

Workers: process per worker versus thread per worker

The worker that runs your SQL has to be realized as something the operating system schedules. There are three baseline models [survey §2.1]:

How a DBMS worker is realized, and what each choice trades.
ModelRealizationWinsUsed by
Process per workerone OS process per connectionOS isolation and tooling, simple to buildDB2, PostgreSQL, Oracle
Thread per workerone thread per connection in one processscales to many connections, less per-worker stateSQL Server, MySQL, Informix
Process poolbounded pool of reused processesmemory efficient, bounds concurrencyOracle, DB2 (option)
Why thread-per-worker scales better, and the catch

A process carries a private address space, OS resource handles, and a security context. Switching between processes means switching all of that, which is heavier than switching between threads that share one address space. So thread per worker scales to more concurrent connections. The catch is that threads share memory with no protection, so one buggy operator can corrupt another worker's state, and race conditions are harder to debug. The deeper catch undercuts the apparent isolation of process per worker too: the buffer pool and the lock table must be shared across all workers, so even the process model puts them in OS shared memory. Isolation is therefore partial in both designs.

Admission control: stopping thrashing before it starts

If every arriving query started immediately, throughput would climb to a peak and then collapse, because the working set of pages would no longer fit in the buffer pool and the system would spend its time shuffling pages to and from disk. That collapse is thrashing. Admission control prevents it by refusing or deferring work when resources are tight, so the system degrades gracefully: latency rises with arrival rate while throughput stays at its peak.

In real systems

The survey describes admission control in two tiers [§2.4]. A dispatcher-level check caps the number of client connections. Then an execution admission controller inside the query processor runs after parse and optimize, because only then does the optimizer know the query's estimated memory footprint (including sort and hash space). Memory pressure in the buffer pool, driven by sorts and hash joins, is the usual cause of thrashing; lock contention is the secondary cause. This is why admission control is keyed mainly on estimated memory.

offered load (arrival rate) throughput no control: thrashes with control: graceful plateau
Figure 2. Without admission control, throughput peaks then collapses as the buffer pool overflows. With it, throughput holds at the peak and only latency degrades. This is the curve component 2 is built to flatten.

Two real engines, same front, different back

Every relational engine starts a query the same way: tokenize, parse to a tree. They diverge on how they run it. The contrast worth holding is between PostgreSQL and SQLite.

PostgreSQL SQLite string -> raw parse tree (flex + bison) analyze -> Query tree (typed) rewrite -> Query tree (views expanded) plan -> Plan tree, tree-walking executor string -> AST (tokenizer + Lemon LALR) code generator: one pass does bind, rewrite, optimize, emit bytecode VDBE bytecode, register VM runs it
Figure 3. Both engines tokenize and parse to a tree. PostgreSQL keeps explicit, separately named stages (raw parse tree, Query tree, Plan tree) and a tree-walking iterator executor. SQLite collapses analysis through code generation into one pass and runs flat VDBE bytecode on a register virtual machine.
In real systems

PostgreSQL's path of a query has five named stages: connection, parser, rewrite system, planner/optimizer, executor [PG path of a query]. Its lexer lives in scan.l (compiled by flex) and its grammar in gram.y (compiled by bison) [PG parser stage]. SQLite instead "translates the SQL into bytecode," which it calls analogous to converting a program into machine code, and you can see that bytecode by typing EXPLAIN before a statement [SQLite architecture]. Same tokenizer-then-grammar front; a tree executor versus a bytecode VM at the back.

The slab: every week on the stack

This is the figure to keep open in another tab for the rest of the course. Each later week owns one layer of the engine. When a topic feels disconnected, find its slab here.

Query layer w1, 8, 9, 10, 11 Index structures w6, 7 Buffer pool w4, 5 Storage engine w2, 3 Transactions and concurrency w12, 13 Logging and recovery w14 Modern and distributed (w15, 16) recompose every layer above.
Figure 4. The course walks the stack twice: a SELECT down the read path (weeks 1, 8, 11, 9, 10, 6, 7, 4, 5, 2, 3) and an UPDATE down the write path (weeks 12, 13, 14, back through 4, 5, 2, 3). Weeks 15 and 16 do not add a layer; they rebuild these layers under new pressure.
Exam trap

Two traps live in this lesson. First, do not put name and type checking in the parser: the parser is pure syntax, and existence, types, and authorization are the binding step. Second, do not say thread per worker gives perfect isolation between connections: threads share one address space with no memory protection, and even the process model shares the buffer pool and lock table. The "which is false" questions below lean on both.

The life of a query, in order

  1. The client communications manager accepts the SQL string on an existing connection.
  2. The process manager assigns a worker (1:1 with the client) and runs admission control: run now or defer.
  3. The query processor authorizes, parses to a raw tree (syntax only), binds against the catalog (types, existence), rewrites, optimizes to a physical plan, then executes.
  4. Execution drives access methods, which request pages from the buffer pool, which reads them from the storage manager on a miss.
  5. Tuples unwind back up: storage to buffer to access method to executor to communications manager, one FETCH batch at a time, to the client.
Deeper: why large result sets cause repeated FETCH iterations

The executor does not build the whole answer and hand it back in one shot. It produces tuples on demand, and the client pulls them "one at a time or in small batches" [survey §4]. A large result set therefore causes repeated FETCH iterations that cycle through the communications manager, the executor, and the storage manager. This is the same demand-driven, pull-based shape that the Volcano iterator model formalizes in week 9: parents pull, children produce one tuple per call. The simulator's "result rows" control models exactly this repeated unwind.

Check yourself

Deciding whether to begin a query now or defer it until memory is available belongs to which component?

Admission control is a process-manager job, though the execution-tier admission check runs after the query processor has optimized and can estimate memory footprint. It is not the storage manager's call, and the communications manager only handles connection state and shipping results.

Which statement about process-per-worker versus thread-per-worker is false?

The false claim is that process per worker fully isolates workers. The buffer pool and lock table must be shared across all workers, so they live in OS shared memory even in the process model, which undercuts the isolation. The other three statements are accurate.

In the pipeline, which stage first resolves the name emp to a real table and attaches a type to salary?

Name resolution, existence checks, type attachment, and authorization happen in binding (analysis), which runs after the parser. The parser applies only syntactic rules and makes no catalog lookups, because a catalog lookup needs a transaction. The rewriter and optimizer operate on the already-bound query tree.

Which statement about admission control and thrashing is false?

The false statement names lock contention as the only cause. The survey identifies buffer-pool memory pressure (driven by sorts and hash joins) as the usual cause of thrashing, with lock contention secondary. That is exactly why admission control keys on the optimizer's estimated memory footprint.

Comparing how PostgreSQL and SQLite execute the same SELECT, which statement is false?

SQLite does not skip parsing. It tokenizes and parses to an AST with its Lemon LALR(1) parser, then a code generator compiles the AST to VDBE bytecode. The front of both engines is identical; only the back differs (tree executor versus bytecode VM).

Name the five components of a relational engine and the one job of each.
Client communications manager (connection state, ship results), process manager (assign worker, admission control), relational query processor (authorize, compile, execute), transactional storage manager (access methods, buffer, lock, log), shared catalog and memory (consulted by almost every query).
In order, what artifact does each pipeline stage hand on?
Parse -> raw parse tree, bind -> typed query tree, rewrite -> logical plan (relational algebra), optimize -> physical plan (concrete operators), execute -> result tuples. Then the stack unwinds back to the client.

Paper viva: Architecture of a Database System

This week's assigned paper is Hellerstein, Stonebraker, and Hamilton's survey. Try each examiner question out loud before opening the model answer.

Viva prompt 1

Name the five components of an RDBMS and say what each one does for a single query.

Model answer

The five components are the client communications manager (establishes and remembers connection state and ships data and control messages back), the process manager (assigns a thread of computation, the DBMS worker, and performs admission control), the relational query processor (checks authorization, compiles SQL into a query plan, and runs it through the plan executor), the transactional storage manager (access methods, buffer manager, lock manager for isolation, and log manager for durability and rollback), and the shared components and utilities (catalog manager and memory manager, used during essentially every query, plus replication, loaders, backup, and monitoring). They are introduced in Section 1.1 via the airport gate-agent query and shown in Figure 1.1.

Viva prompt 2

What is the mapping between DBMS clients and DBMS workers, and why does it matter for the process models?

Model answer

The mapping is 1:1: one DBMS worker handles all the SQL from one DBMS client (Section 2). It matters because the three process models differ only in how that worker is realized. Process per worker maps the worker onto an OS process; thread per worker maps it onto an OS thread inside one shared process; process pool borrows a process from a bounded pool for the duration of a statement and returns it. The choice changes memory footprint, isolation, and scalability, but the 1:1 worker-to-client relationship is constant across all three.

Viva prompt 3

What is admission control, why is it needed, and where in the engine does it sit?

Model answer

Admission control refuses or defers new work unless resources are available, which prevents thrashing and produces graceful degradation: as load rises, latency increases proportionally while throughput stays at its peak instead of collapsing (Section 2.4). It is needed because, left unchecked, the working set of pages eventually exceeds the buffer pool and the system spends its time shuffling pages to and from disk; sorting and hash joins are the worst offenders, and lock contention is a secondary cause. It sits in two tiers: a simple dispatcher-level check that caps the number of client connections, and an execution admission controller inside the relational query processor that runs after parse and optimize and decides whether to postpone the query, start it with fewer resources, or start it unconstrained. The second tier runs after optimization because only then is the query's estimated memory footprint, including sort and hash space, known.

Primary source
Architecture of a Database System (Hellerstein, Stonebraker, Hamilton, 2007)

The canonical end-to-end picture of how a SQL statement traverses the engine. Read Section 1 and study Figure 1.1, the five components and the gate-agent walkthrough, then skim Section 2.4 on admission control. This is the assigned paper for the week 1 viva. dsf.berkeley.edu/papers/fntdb07-architecture.pdf

Ask your teacher

Want me to redraw the five-component diagram for a write instead of a read, walk the SQLite bytecode for our SELECT, or quiz you on the pipeline artifacts until you can name them cold? Ask. I am your teacher for this course, not just a document.