Reputation: 7883
I need to store a set of ordered integers which can form sequences like the following
1,2,3,15,16,20,21,25,26,27,28
It can also be represented as
1-3,15-16,20-21,25-28
I don't need the sequences to be ordered I just need to be able to add integers and know if some integer is in the set.
I'm looking for datastructure that is fast O(lg(n))
or O(n*lg(n))
in terms of insertion and search ie. is X in the set of integers, that can handle concurrent read-write and if possible write-write without locks and without persistence.
For same insertion and search time, the more space efficient solution will be choosen.
A binary search tree but it is not good enough because since integers are inserted in ever growing ascending orderd the generated tree doesn't look good, so I think I need a multiversion self balancing tree.
There is no duplicates.
No code is needed just an explanation with references will do the job.
Background: This is for a mvcc database, each transaction has a transaction id which should be unique while being ordered ie. for two T1(t1), T2(t2), id(T1) < id(T2). The gaps comes from the fact a transaction doesn't commit its transaction id is lost. Transaction ids are used to annotate data versions, to know if a version of a data should be considered and how, you must know at least if it's commited for that I must maintain a list of commited transaction, a hash map of intergers can do the job perfectly for a POC but not in the long run. I don't know how professionnal dbs do that...
Similar question which can be a bit misleading: Finding a gap in an ordered range of adjacent numbers
Upvotes: 1
Views: 1140
Reputation: 363757
I suggest an interval tree -- that's an amended binary search tree that compresses intervals. The problem of ordered insertion can be handled by using a self-balancing variant. Concurrency support can be achieved either with locks or by implementing a persistent version.
Upvotes: 2