B+tree operations cheat sheet · Field Guide to the Database Engine

Invariants (order m)

What a B+tree always guarantees between operations.
Invariant	Statement
Data in leaves	All record entries live in leaves. Internal nodes hold only separator keys and child pointers; they route, never store payload. This is the line between a B+tree and a plain B-tree.
Linked leaves	Leaves form a (doubly) linked list in key order, so a range scan walks leaf to leaf without going back up. PostgreSQL keeps a doubly linked leaf chain.
Height balanced	Every root-to-leaf path has the same length. Balance is structural, not a periodic rebuild. Tree grows or shrinks only at the root.
Half full (non-root)	Every non-root node holds between ceil(m/2) - 1 and m - 1 keys (internal: between ceil(m/2) and m children).
Root exemption	The root may hold as few as 1 key (2 children), or 0 keys when the tree is a single leaf. The half-full rule does not apply to the root.
Separator order	Between adjacent separators, a search key v follows Ki < v <= Ki+1 (the exact nbtree inequality).

Why half full

Forcing at least 50 percent occupancy on non-root nodes bounds the height at about log_(m/2) N, so the logarithmic I/O cost holds in the worst case, not just on average. Without it the tree could degenerate.

Search

Start at the root. Binary-search the separators to pick the child whose range contains v, using Ki < v <= Ki+1.
Descend one page read per level until a leaf, then binary-search the leaf.
Range scan: descend once to the lower bound, then follow leaf links rightward to the upper bound.

Cost: one page read per level, O(log_m N) I/Os. Upper levels usually stay cached in the buffer pool, so marginal cost is closer to one or two leaf reads.

Insert (split rule)

Descend to the target leaf as in search.
If the leaf has room, insert in sorted position. Done, no structural change.
If the leaf is full, split: allocate a new leaf, move the upper half of entries into it, link it into the leaf chain.
Leaf split = copy up. The first key of the new right leaf is copied up as the parent separator; the key still exists in the leaf, because all data stays in leaves.
Internal split = push up. If inserting the separator overflows an internal node, split it and move (push) the middle key up to the grandparent; internal nodes only route, so no copy is kept.
Splits propagate upward only while overflow continues. If the root splits, allocate a new root over the two halves. This is the only way height grows, and it grows from the top, keeping all leaves at equal depth.

Delete (underflow)

Descend to the leaf and remove the entry.
If the node still has >= ceil(m/2) - 1 keys, done.
If it underflows (< ceil(m/2) - 1 keys), restore the invariant using an adjacent sibling.
Redistribute (borrow / rotate): if a sibling has more than the minimum, move one entry across and update the parent separator. Tree shape unchanged.
Merge (coalesce): if no sibling can spare an entry, combine the node with a sibling and remove the now-unused separator from the parent.
A merge removes a parent key, which can underflow the parent, so merge or redistribute propagates upward like splits. If the root ends with one child, that child becomes the new root and the tree shrinks in height.

Underflow decision: redistribute versus merge.
Condition after delete	Action	Effect on tree
Node has >= ceil(m/2) - 1 keys	Nothing	No change
Underflow, adjacent sibling has > minimum keys	Redistribute / borrow one entry, update parent separator	Shape unchanged, no propagation
Underflow, no sibling above minimum	Merge with sibling, drop separator from parent	May cascade upward; root may collapse and shrink height

Exam traps

Leaf split copies up, internal split pushes up; do not swap them. The root is exempt from half-full, so "every node is at least half full" is false. Height changes only on a root split (taller) or a root collapse after merge (shorter), never on an ordinary insert.

Leaf split, before and after

Figure 1. The leaf overflows, the upper half moves to a new right leaf, and its first key (17) is copied up into the parent. The data 17 still lives in the leaf. An internal split would instead push the middle key up and not keep it below.

Bulk loading (bottom up)

One-at-a-time insertion of N keys costs N descents and many random splits. If keys are sorted (or can be sorted first), build the tree bottom up instead.

Sort the entries.
Pack them into leaves left to right up to a target fill (the fillfactor, leaving slack for later inserts).
Link the leaves into the chain.
Build the parent level from the first key of each leaf; repeat upward until one root remains.

Result: sequential I/O, controlled page density, and a denser, shallower tree than repeated insertion. PostgreSQL B-tree fillfactor defaults to 90 (range 10 to 100).

Complexity

Page-read (I/O) cost in a B+tree of order m over N keys.
Operation	I/O cost	Note
Point search (=)	O(log_m N)	One read per level; upper levels usually cached
Range scan (k matches)	O(log_m N + k/leaf)	One descent, then sequential leaf links
Insert	O(log_m N)	Dominated by descent; splits rare amortized (halves stay half full)
Delete	O(log_m N)	Dominated by descent; merge or redistribute may cascade up
Bulk load N sorted keys	O(N) sequential	Plus the sort if not already ordered
Height	about log_m N	Fanout 100: 3 levels reach ~1M leaf entries, 4 levels ~100M

Keep this one thing

The log base is the fanout m, not 2. Wide nodes make a single page read a high-fanout decision, so the tree is 3 to 4 levels deep for hundreds of millions of rows. "O(log2 N) lookups" is the trap.

B-tree vs B+tree vs hash

When each access method wins. B+tree is PostgreSQL nbtree and SQLite table/index b-trees.
Property	Plain B-tree	B+tree	Hash index
Where data / pointers live	Internal nodes and leaves	Leaves only; internals route	Buckets keyed by hash
Point lookup (=)	O(log_m N) I/O	O(log_m N) I/O	expected O(1) I/O
Range / inequality (<, >, BETWEEN)	possible but no linked leaves	yes, ordered linked leaves	no
ORDER BY without a sort	in-order traversal, not leaf-linked	yes, walk the leaf chain	no
Prefix of composite key	yes	yes (leftmost prefix)	no, needs the full key
Worst case under skew / growth	bounded by height	bounded by height	long overflow chains

In real systems

PostgreSQL nbtree is a B+tree on the Lehman-Yao design (right-links and high keys for lock-light search), with lazy two-stage deletion rather than eager textbook merge on every underflow. SQLite stores tables and indexes as b-trees with data only in leaves. PostgreSQL hash indexes are single column, equality only, no uniqueness, and crash-safe since v10 [PG hash docs].