Field Guide / Reference / B+tree operations

B+tree operations cheat sheet

Order m. Search, insert with splits, delete with merge or redistribute, bulk loading, and the rules that decide each. Everything is keyed to the half-full invariant.

Invariants (order m)

What a B+tree always guarantees between operations.
InvariantStatement
Data in leavesAll record entries live in leaves. Internal nodes hold only separator keys and child pointers; they route, never store payload. This is the line between a B+tree and a plain B-tree.
Linked leavesLeaves form a (doubly) linked list in key order, so a range scan walks leaf to leaf without going back up. PostgreSQL keeps a doubly linked leaf chain.
Height balancedEvery root-to-leaf path has the same length. Balance is structural, not a periodic rebuild. Tree grows or shrinks only at the root.
Half full (non-root)Every non-root node holds between ceil(m/2) - 1 and m - 1 keys (internal: between ceil(m/2) and m children).
Root exemptionThe root may hold as few as 1 key (2 children), or 0 keys when the tree is a single leaf. The half-full rule does not apply to the root.
Separator orderBetween adjacent separators, a search key v follows Ki < v <= Ki+1 (the exact nbtree inequality).
Why half full

Forcing at least 50 percent occupancy on non-root nodes bounds the height at about log_(m/2) N, so the logarithmic I/O cost holds in the worst case, not just on average. Without it the tree could degenerate.

Search

  1. Start at the root. Binary-search the separators to pick the child whose range contains v, using Ki < v <= Ki+1.
  2. Descend one page read per level until a leaf, then binary-search the leaf.
  3. Range scan: descend once to the lower bound, then follow leaf links rightward to the upper bound.

Cost: one page read per level, O(log_m N) I/Os. Upper levels usually stay cached in the buffer pool, so marginal cost is closer to one or two leaf reads.

Insert (split rule)

  1. Descend to the target leaf as in search.
  2. If the leaf has room, insert in sorted position. Done, no structural change.
  3. If the leaf is full, split: allocate a new leaf, move the upper half of entries into it, link it into the leaf chain.
  4. Leaf split = copy up. The first key of the new right leaf is copied up as the parent separator; the key still exists in the leaf, because all data stays in leaves.
  5. Internal split = push up. If inserting the separator overflows an internal node, split it and move (push) the middle key up to the grandparent; internal nodes only route, so no copy is kept.
  6. Splits propagate upward only while overflow continues. If the root splits, allocate a new root over the two halves. This is the only way height grows, and it grows from the top, keeping all leaves at equal depth.

Delete (underflow)

  1. Descend to the leaf and remove the entry.
  2. If the node still has >= ceil(m/2) - 1 keys, done.
  3. If it underflows (< ceil(m/2) - 1 keys), restore the invariant using an adjacent sibling.
  4. Redistribute (borrow / rotate): if a sibling has more than the minimum, move one entry across and update the parent separator. Tree shape unchanged.
  5. Merge (coalesce): if no sibling can spare an entry, combine the node with a sibling and remove the now-unused separator from the parent.
  6. A merge removes a parent key, which can underflow the parent, so merge or redistribute propagates upward like splits. If the root ends with one child, that child becomes the new root and the tree shrinks in height.
Underflow decision: redistribute versus merge.
Condition after deleteActionEffect on tree
Node has >= ceil(m/2) - 1 keysNothingNo change
Underflow, adjacent sibling has > minimum keysRedistribute / borrow one entry, update parent separatorShape unchanged, no propagation
Underflow, no sibling above minimumMerge with sibling, drop separator from parentMay cascade upward; root may collapse and shrink height
Exam traps

Leaf split copies up, internal split pushes up; do not swap them. The root is exempt from half-full, so "every node is at least half full" is false. Height changes only on a root split (taller) or a root collapse after merge (shorter), never on an ordinary insert.

Leaf split, before and after

Before: insert 19 into a full leaf (m = 4) 23 . 41 11 . 14 . 17 . 19? full, no room for 19 After: split, copy 17 up as the new separator 17 23 41 11 . 14 17 . 19 (-> next leaf)
Figure 1. The leaf overflows, the upper half moves to a new right leaf, and its first key (17) is copied up into the parent. The data 17 still lives in the leaf. An internal split would instead push the middle key up and not keep it below.

Bulk loading (bottom up)

One-at-a-time insertion of N keys costs N descents and many random splits. If keys are sorted (or can be sorted first), build the tree bottom up instead.

  1. Sort the entries.
  2. Pack them into leaves left to right up to a target fill (the fillfactor, leaving slack for later inserts).
  3. Link the leaves into the chain.
  4. Build the parent level from the first key of each leaf; repeat upward until one root remains.

Result: sequential I/O, controlled page density, and a denser, shallower tree than repeated insertion. PostgreSQL B-tree fillfactor defaults to 90 (range 10 to 100).

Complexity

Page-read (I/O) cost in a B+tree of order m over N keys.
OperationI/O costNote
Point search (=)O(log_m N)One read per level; upper levels usually cached
Range scan (k matches)O(log_m N + k/leaf)One descent, then sequential leaf links
InsertO(log_m N)Dominated by descent; splits rare amortized (halves stay half full)
DeleteO(log_m N)Dominated by descent; merge or redistribute may cascade up
Bulk load N sorted keysO(N) sequentialPlus the sort if not already ordered
Heightabout log_m NFanout 100: 3 levels reach ~1M leaf entries, 4 levels ~100M
Keep this one thing

The log base is the fanout m, not 2. Wide nodes make a single page read a high-fanout decision, so the tree is 3 to 4 levels deep for hundreds of millions of rows. "O(log2 N) lookups" is the trap.

B-tree vs B+tree vs hash

When each access method wins. B+tree is PostgreSQL nbtree and SQLite table/index b-trees.
PropertyPlain B-treeB+treeHash index
Where data / pointers liveInternal nodes and leavesLeaves only; internals routeBuckets keyed by hash
Point lookup (=)O(log_m N) I/OO(log_m N) I/Oexpected O(1) I/O
Range / inequality (<, >, BETWEEN)possible but no linked leavesyes, ordered linked leavesno
ORDER BY without a sortin-order traversal, not leaf-linkedyes, walk the leaf chainno
Prefix of composite keyyesyes (leftmost prefix)no, needs the full key
Worst case under skew / growthbounded by heightbounded by heightlong overflow chains
In real systems

PostgreSQL nbtree is a B+tree on the Lehman-Yao design (right-links and high keys for lock-light search), with lazy two-stage deletion rather than eager textbook merge on every underflow. SQLite stores tables and indexes as b-trees with data only in leaves. PostgreSQL hash indexes are single column, equality only, no uniqueness, and crash-safe since v10 [PG hash docs].