user256717
user256717

Reputation:

Trie vs B+ tree

How does Trie and B+ tree compare for indexing lexicographically sorted strings [on the order some billions]? It should support range queries as well.

From perf. as well as implementation complexity point of view.

Upvotes: 20

Views: 9334

Answers (3)

Denis Bazhenov
Denis Bazhenov

Reputation: 9945

Depends on your actual task:

  • If you want to get the whole subtree, a B+Tree is your best choice because it is space efficient.
  • But if you want to get the first N children from a substree, then a Trie is the best choice because you simply visit less nodes than in a B+ Tree scenario.
  • The most popular task which is well-handled by a Trie is a word prefix completion.

Upvotes: 4

Matthieu M.
Matthieu M.

Reputation: 299930

I would say it depends on what you mean by Range.

If your range is expressed as All words beginning by, then a Trie is the right choice I'd say. On the other hand, Trie are not meant for requests like All words between XX and ZZ.

Note that the branching factor of the B+ Tree affects its performance (the number of intermediary nodes). If h is the height of the tree, then nmax ~~ bh. Therefore h ~~ log(nmax) / log(b).

With n = 1 000 000 000 and b = 100, we have h ~~ 5. Therefore it means only 5 pointer dereferencing for going from the root to the leaf. It's more cache-friendly than a Trie.

Finally, B+ Tree is admittedly more difficult to implement than a Trie: it's more on a Red-Black Tree level of complexity.

Upvotes: 18

thSoft
thSoft

Reputation: 22660

Wikipedia has some algorithmic complexity facts: B+ tree (section Characteristics), Trie (unfortunately spread all over the article). Hope that helps.

Upvotes: -1

Related Questions