DBMS architecture: the life of a query
You type one short SELECT and rows come back. In between, a multi-user server has to decide who runs, turn a string into a plan, find bytes on a disk, and walk the answer back up. This lesson is the map of that journey, and the map every later week points back to.
By the end you can
- Name the five components of a relational engine and say what each one does for one query.
- Trace the query processing pipeline from a SQL string down to disk and back up, naming the artifact each stage produces.
- Explain why a database is a server, not a library: admission control, workers, and the process-versus-thread fork.
- Contrast how PostgreSQL and SQLite execute the same query, and place every later week on this map.
The whole course chases one query and one transaction through the engine. Here is the query, and it will reappear in every lesson from here on:
SELECT name FROM emp WHERE salary > 75000 ORDER BY name;
This is week 1, the top of the stack. Nothing is handed to us from above: a raw SQL string arrives over a socket from a client. What this layer hands down is also a string, but now to a specific worker process or thread that has been admitted to run, with the relational query processor ready to compile it. Everything below (parsing in week 8, optimization in week 11, execution in week 9, indexes in week 6, the buffer pool in week 4, and storage in week 2) is a layer this lesson names but does not yet open. If at any point a later topic feels like an island, come back here and find it on the slab.
Think of an airport gate. A passenger asks for the flight manifest. A gate agent does not pull the wires out of the terminal; the request flows through a counter that knows who you are, a supervisor who decides whether there is staff free to handle it now, a clerk who reads the request and fetches the right records, and the answer comes back the same way it went in. A database is built the same way: layered handlers, each with one job, and the answer "unwinds the stack" back to you. The survey that grounds this lesson opens with exactly this gate-agent story.
The five components
A naive picture says the engine is one black box that reads files. That picture cannot explain how a hundred users share the machine, why a query can be told to wait, how rows get cached, or how a crash is survived. The moment you ask any of those questions you are forced to break the box into parts. Hellerstein, Stonebraker, and Hamilton break it into five [Architecture of a Database System, Fig. 1.1].
Component by component, for our SELECT:
- Client communications manager. The connection already exists; this layer remembers who you are and
over which protocol you speak, accepts the SQL text, and later ships
namevalues back. It must speak many connectivity protocols because deployments are two-tier, three-tier, or four-tier [survey §1.1]. - Process manager. It assigns a thread of computation, a DBMS worker, to your command, with a 1:1 mapping: one worker handles all the SQL from your one connection. Its sharpest decision is admission control: run now, or defer until the machine has room. More on this below.
- Relational query processor. It checks that you are authorized, compiles the SQL into a query plan, and runs that plan through a suite of operators (selection, projection, sort) plus operators that fetch records from the layer below. This is weeks 8 through 11.
- Transactional storage manager. It owns every byte of data through four sub-managers: access methods (heaps and indexes), the buffer manager, the lock manager (isolation), and the log manager (durability and rollback). This is weeks 2 through 7 plus the transaction and recovery weeks.
- Shared components and utilities. The catalog manager (which tables and columns exist, with what types) and the memory manager are consulted during almost every query: authentication, parsing, optimization. Backup, replication, and monitoring run on the side.
Two forces pull the box apart. First, a database is a server, shared by many callers, so something must manage connections (component 1) and decide who runs (component 2) before any data work begins. Second, SQL is declarative: it states what you want, not how to get it, so something must compile and run it (component 3) on top of something that safely persists and shares bytes (component 4). The catalog and memory (component 5) are pulled out because every other component needs them, so making them shared avoids threading the same dependency through all four.
Inside component 3: the query processing pipeline
Component 3 is itself a small pipeline. The survey describes it as taking "a declarative SQL statement, validates it, optimizes it into a procedural dataflow execution plan, and (subject to admission control) executes that dataflow program," after which the client pulls result tuples one at a time or in small batches [survey §4]. Each stage produces a distinct artifact, and naming the artifact is the whole trick to not getting lost.
| Stage | What it does | Artifact it hands on | Week |
|---|---|---|---|
| Parse | tokenize the string, apply grammar rules only | raw parse tree (names are still strings) | 8 |
| Bind / analyze | resolve emp and salary against the catalog, attach types, authorize | query tree (validated, typed) | 8 |
| Rewrite | expand views, fold constants, normalize predicates (query + catalog only, never data) | logical plan (relational algebra) | 8, 11 |
| Optimize | estimate selectivity, cost seq-scan versus index, pick the cheapest plan | physical plan (concrete operators) | 11 |
| Execute | run the operator tree pull-style, fetching tuples through access methods | result tuples | 9, 10 |
One subtlety worth pinning down now: parsing uses only syntactic rules and makes no catalog lookups. The reason is concrete. A catalog lookup runs inside a transaction, and pure syntax checking needs no transaction, so PostgreSQL splits "parse the string" from "bind the names" deliberately [PG parser stage]. Confusing parsing with binding is one of the most common exam traps, and week 8 returns to it.
A query travels down: string to raw parse tree to query tree to logical plan to physical plan to tuples, and then "unwinds the stack" back up. Each arrow is a layer with one job, and each layer is owned by a specific week of this course.
Simulator: trace the query stack
Reading the pipeline is one thing; watching one query crawl down it and the rows crawl back up is another. The simulator below animates the descent through the five components and the pipeline, then the unwind. Switch between PostgreSQL and SQLite to see where the two engines agree (the front: tokenize, parse, build a tree) and where they diverge (the back: PostgreSQL keeps named tree stages and a tree-walking executor; SQLite compiles to VDBE bytecode). Raise the result-set size and step into execution to watch FETCH repeat.
Why a database is a server, not a library
A SELECT against a local CSV file is a function call. A SELECT against a real engine crosses into a process that may be serving hundreds of other callers at the same instant. Two design questions fall out of that, and both shape component 2.
Workers: process per worker versus thread per worker
The worker that runs your SQL has to be realized as something the operating system schedules. There are three baseline models [survey §2.1]:
| Model | Realization | Wins | Used by |
|---|---|---|---|
| Process per worker | one OS process per connection | OS isolation and tooling, simple to build | DB2, PostgreSQL, Oracle |
| Thread per worker | one thread per connection in one process | scales to many connections, less per-worker state | SQL Server, MySQL, Informix |
| Process pool | bounded pool of reused processes | memory efficient, bounds concurrency | Oracle, DB2 (option) |
A process carries a private address space, OS resource handles, and a security context. Switching between processes means switching all of that, which is heavier than switching between threads that share one address space. So thread per worker scales to more concurrent connections. The catch is that threads share memory with no protection, so one buggy operator can corrupt another worker's state, and race conditions are harder to debug. The deeper catch undercuts the apparent isolation of process per worker too: the buffer pool and the lock table must be shared across all workers, so even the process model puts them in OS shared memory. Isolation is therefore partial in both designs.
Admission control: stopping thrashing before it starts
If every arriving query started immediately, throughput would climb to a peak and then collapse, because the working set of pages would no longer fit in the buffer pool and the system would spend its time shuffling pages to and from disk. That collapse is thrashing. Admission control prevents it by refusing or deferring work when resources are tight, so the system degrades gracefully: latency rises with arrival rate while throughput stays at its peak.
The survey describes admission control in two tiers [§2.4]. A dispatcher-level check caps the number of client connections. Then an execution admission controller inside the query processor runs after parse and optimize, because only then does the optimizer know the query's estimated memory footprint (including sort and hash space). Memory pressure in the buffer pool, driven by sorts and hash joins, is the usual cause of thrashing; lock contention is the secondary cause. This is why admission control is keyed mainly on estimated memory.
Two real engines, same front, different back
Every relational engine starts a query the same way: tokenize, parse to a tree. They diverge on how they run it. The contrast worth holding is between PostgreSQL and SQLite.
PostgreSQL's path of a query has five named stages: connection, parser, rewrite system, planner/optimizer,
executor [PG path of a query]. Its lexer
lives in scan.l (compiled by flex) and its grammar in gram.y (compiled by bison)
[PG parser stage]. SQLite instead
"translates the SQL into bytecode," which it calls analogous to converting a program into machine code, and you
can see that bytecode by typing EXPLAIN before a statement
[SQLite architecture]. Same tokenizer-then-grammar front;
a tree executor versus a bytecode VM at the back.
The slab: every week on the stack
This is the figure to keep open in another tab for the rest of the course. Each later week owns one layer of the engine. When a topic feels disconnected, find its slab here.
Two traps live in this lesson. First, do not put name and type checking in the parser: the parser is pure syntax, and existence, types, and authorization are the binding step. Second, do not say thread per worker gives perfect isolation between connections: threads share one address space with no memory protection, and even the process model shares the buffer pool and lock table. The "which is false" questions below lean on both.
The life of a query, in order
- The client communications manager accepts the SQL string on an existing connection.
- The process manager assigns a worker (1:1 with the client) and runs admission control: run now or defer.
- The query processor authorizes, parses to a raw tree (syntax only), binds against the catalog (types, existence), rewrites, optimizes to a physical plan, then executes.
- Execution drives access methods, which request pages from the buffer pool, which reads them from the storage manager on a miss.
- Tuples unwind back up: storage to buffer to access method to executor to communications manager, one FETCH batch at a time, to the client.
Deeper: why large result sets cause repeated FETCH iterations
The executor does not build the whole answer and hand it back in one shot. It produces tuples on demand, and the client pulls them "one at a time or in small batches" [survey §4]. A large result set therefore causes repeated FETCH iterations that cycle through the communications manager, the executor, and the storage manager. This is the same demand-driven, pull-based shape that the Volcano iterator model formalizes in week 9: parents pull, children produce one tuple per call. The simulator's "result rows" control models exactly this repeated unwind.
Check yourself
Deciding whether to begin a query now or defer it until memory is available belongs to which component?
Admission control is a process-manager job, though the execution-tier admission check runs after the query processor has optimized and can estimate memory footprint. It is not the storage manager's call, and the communications manager only handles connection state and shipping results.
Which statement about process-per-worker versus thread-per-worker is false?
The false claim is that process per worker fully isolates workers. The buffer pool and lock table must be shared across all workers, so they live in OS shared memory even in the process model, which undercuts the isolation. The other three statements are accurate.
In the pipeline, which stage first resolves the name emp to a real table and attaches a type to salary?
Name resolution, existence checks, type attachment, and authorization happen in binding (analysis), which runs after the parser. The parser applies only syntactic rules and makes no catalog lookups, because a catalog lookup needs a transaction. The rewriter and optimizer operate on the already-bound query tree.
Which statement about admission control and thrashing is false?
The false statement names lock contention as the only cause. The survey identifies buffer-pool memory pressure (driven by sorts and hash joins) as the usual cause of thrashing, with lock contention secondary. That is exactly why admission control keys on the optimizer's estimated memory footprint.
Comparing how PostgreSQL and SQLite execute the same SELECT, which statement is false?
SQLite does not skip parsing. It tokenizes and parses to an AST with its Lemon LALR(1) parser, then a code generator compiles the AST to VDBE bytecode. The front of both engines is identical; only the back differs (tree executor versus bytecode VM).
Paper viva: Architecture of a Database System
This week's assigned paper is Hellerstein, Stonebraker, and Hamilton's survey. Try each examiner question out loud before opening the model answer.
Name the five components of an RDBMS and say what each one does for a single query.
Model answer
The five components are the client communications manager (establishes and remembers connection state and ships data and control messages back), the process manager (assigns a thread of computation, the DBMS worker, and performs admission control), the relational query processor (checks authorization, compiles SQL into a query plan, and runs it through the plan executor), the transactional storage manager (access methods, buffer manager, lock manager for isolation, and log manager for durability and rollback), and the shared components and utilities (catalog manager and memory manager, used during essentially every query, plus replication, loaders, backup, and monitoring). They are introduced in Section 1.1 via the airport gate-agent query and shown in Figure 1.1.
What is the mapping between DBMS clients and DBMS workers, and why does it matter for the process models?
Model answer
The mapping is 1:1: one DBMS worker handles all the SQL from one DBMS client (Section 2). It matters because the three process models differ only in how that worker is realized. Process per worker maps the worker onto an OS process; thread per worker maps it onto an OS thread inside one shared process; process pool borrows a process from a bounded pool for the duration of a statement and returns it. The choice changes memory footprint, isolation, and scalability, but the 1:1 worker-to-client relationship is constant across all three.
What is admission control, why is it needed, and where in the engine does it sit?
Model answer
Admission control refuses or defers new work unless resources are available, which prevents thrashing and produces graceful degradation: as load rises, latency increases proportionally while throughput stays at its peak instead of collapsing (Section 2.4). It is needed because, left unchecked, the working set of pages eventually exceeds the buffer pool and the system spends its time shuffling pages to and from disk; sorting and hash joins are the worst offenders, and lock contention is a secondary cause. It sits in two tiers: a simple dispatcher-level check that caps the number of client connections, and an execution admission controller inside the relational query processor that runs after parse and optimize and decides whether to postpone the query, start it with fewer resources, or start it unconstrained. The second tier runs after optimization because only then is the query's estimated memory footprint, including sort and hash space, known.
The canonical end-to-end picture of how a SQL statement traverses the engine. Read Section 1 and study Figure 1.1, the five components and the gate-agent walkthrough, then skim Section 2.4 on admission control. This is the assigned paper for the week 1 viva. dsf.berkeley.edu/papers/fntdb07-architecture.pdf
Want me to redraw the five-component diagram for a write instead of a read, walk the SQLite bytecode for our SELECT, or quiz you on the pipeline artifacts until you can name them cold? Ask. I am your teacher for this course, not just a document.