andandandand
andandandand

Reputation: 22270

LSM Tree lookup time

What's the worst case time complexity in a log-structured merge tree for a simple search query (like querying a single WHERE clause)?

Is it O(log N)? O(N*Log N)? Something else?

How about for a multiple query, like searching for multiple WHERE clauses in a key-value database?

The wikipedia page on LSM trees is currently lacking this info.

And I'm trying to make sense of the original paper.

Upvotes: 3

Views: 1620

Answers (2)

AndyF
AndyF

Reputation: 193

I have been wondering the same.

If you have a series of trees, getting smaller by a constant factor each time, and you need to search them all for a single key, the cost seems to be O(log(N)^2).

Say the first (binary) tree takes log_2(N) branches to reach a node. The second might be half the size, and take (log_2(N) - 1) branches to find a node. The smallest tree will be some O(1) constant in size and there are roughly log_2(N) trees total. Summing the series gives O(log_2(N)^2).

However, I'm wondering if there is some more clever scheme where arbitrary single-key lookups, insertions or deletions have amortized cost O(log(N)), but haven't been able to find an answer (yet).

Upvotes: 1

Warren Dew
Warren Dew

Reputation: 8928

For a simple search indexed by a LSM tree, it is O(log n). This is because the biggest tree in the LSM tree is a B tree, which is O(log n), and the other trees are subsets of B trees or in the case of in memory trees, more efficient trees, which are no worse than O(log n). The number of trees is a constant, so it doesn't affect the order of the search time.

Upvotes: -1

Related Questions