Reputation:
How does Trie and B+ tree compare for indexing lexicographically sorted strings [on the order some billions]? It should support range queries as well.
From perf. as well as implementation complexity point of view.
Upvotes: 20
Views: 9334
Reputation: 9945
Depends on your actual task:
N
children from a substree, then a Trie is the best choice because you simply visit less nodes than in a B+ Tree scenario. Upvotes: 4
Reputation: 299930
I would say it depends on what you mean by Range.
If your range is expressed as All words beginning by, then a Trie
is the right choice I'd say. On the other hand, Trie
are not meant for requests like All words between XX and ZZ.
Note that the branching factor of the B+ Tree
affects its performance (the number of intermediary nodes). If h
is the height of the tree, then nmax ~~ bh. Therefore h ~~ log(nmax) / log(b).
With n = 1 000 000 000
and b = 100
, we have h ~~ 5
. Therefore it means only 5 pointer dereferencing for going from the root to the leaf. It's more cache-friendly than a Trie
.
Finally, B+ Tree
is admittedly more difficult to implement than a Trie
: it's more on a Red-Black Tree
level of complexity.
Upvotes: 18