Borys
Borys

Reputation: 3034

B+ tree or B-tree

I am learning about postgresql internals and I am wondering or postgresql B-tree index is actually classic B-tree or B+tree? To spell it out, that means the nodes contain only keys or key-value pairs?

Upvotes: 15

Views: 8159

Answers (2)

Erwin Brandstetter
Erwin Brandstetter

Reputation: 656391

I said B-trees, first, but it's arguably closer to B+ trees.
See iwis' answer discussing it more thoroughly.

You would really have to consider index + heap (+ auxiliary storage) together. An index is mostly useless on its own.

Here is a related chapter on Wikipedia.

The name of the relevant index method is "B-tree" in Postgres. Physical storage is very similar to that of tables (the heap) or any other index type. All use the same data pages with mostly the same page-layout. More in the manual.

Development is ongoing. The design has been changed (improved) in many aspects since this question was asked. The latest notable change (as of April 2021) being deduplication in Postgres 13.

Upvotes: 11

iwis
iwis

Reputation: 1712

It seems to me that PostgreSQL uses B+ tree.

The difference between B-tree and B+ tree

  • In B-tree the pointers to records in the indexed table are not only in the leaves of the tree, but also in all internal nodes of the tree.
  • In B+ tree the pointers to records in the indexed table are only in the leaves of the tree. The advantages of B+ tree over B-tree are described here.

the picture is a modification (the picture is a modification of this picture)

Usage of B+ tree in DBMSs

Oracle, SQL Server, SQLite, DB2, and MySQL use B+ tree. It seems that also PostgreSQL uses B+ tree because:

  • The documentation seems to state that only the leaves of the tree have the pointers to records in the indexed table:

    Each leaf page contains tuples that point to table rows. Each internal page contains tuples that point to the next level down in the tree.

  • When Bruce Momjian says about internal nodes he doesn't mention that they have pointers to records in the indexed table.

  • The src/backend/access/nbtree/README file of the PostgreSQL source code mentioned in the documentation contains this comment:

    Btree Indexing

    This directory contains a correct implementation of Lehman and Yao's high-concurrency B-tree management algorithm (P. Lehman and S. Yao, Efficient Locking for Concurrent Operations on B-Trees, ACM Transactions on Database Systems, Vol 6, No. 4, December 1981, pp 650-670).

    Lehman and Yao use a tree structure named B* tree, which was defined by Wedekind in On the selection of access paths in a data base system paper as B-tree where non-leaf nodes don't have pointers to records in the indexed table (they have pointers only to their children nodes). So the B* tree structure defined by Wedekind is a B+ tree.

Upvotes: 15

Related Questions