Reputation: 725
I want to allow the user to be able to define ranges that will filter data. The ranges defined could be contiguous, over lapping, or separated (e.g. the user enters the following ranges: 1-10, 5-10, 10-12, 7-13, and 15-20).
I then want to filter the data so that the user is only displayed what is within those ranges.
I will probably create code on a different layer that will combine the ranges where appropriate (so the above example will become 1-13 and 15-20, but I don't want my data service to be concerned with that, so it must be able to handle the example above)
I have a lot of data and speed is a priority, so I don't want to have to iterate through a list of ranges for each data item to check if it should be displayed to the user or not.
Is there a data structure (or some algorithm) that could be used to achieve this?
Upvotes: 4
Views: 1598
Reputation: 32182
It's not too hard if your data is allready sorted. Use a combination of
For each of your subranges [min, max] you can find the iterators i_min and i_max and use them as
std::make_pair(i_min, i_max)
to make it "range" compatible. Then use boost::join to concatenate all the sub ranges into a single range ( lazily of course ) and then use this range in down stream processing.
Obviously you should pre-process all your ranges to make sure they are non overlapping.
Upvotes: 0
Reputation: 9130
I think what you want to do is known as Range Minimum Query.
Upvotes: 0
Reputation: 12047
One approach would be to combine ranges you receive and map them to an underlying bitmap indicating in or not in the range.
A class-based design would allow you to overload operator +=
for syntactic sugar, but a bare bitmap would work just as well. For instance:
# original bitmap
bits = [ 0,0,0,0,0,0,0,0,0,0 ]
# add 1-5
bits = [ 0,1,1,1,1,1,0,0,0,0 ]
# add 4 - 6
bits = [ 0,1,1,1,1,1,1,0,0,0 ]
# Look for 3
bits[3] == 1 ?
Upvotes: 0
Reputation: 82
Solution in general depends on range bounds.
max
-min
is not so huge (for example, you define bounds in [1..1024]), you could use just an array, which point every X to the list of ranges. For your example, the array should be:ranges=[0:(1,10), 1:(5,10), 2:(10,12), 3:(7,13), 4:(15-20)] points=[1:[0],2:[0],3:[0],4:[0],5:[0,1],...,7:[0,1,3],...10:[0,1,2,3],...15:[4],...20:[4],21:[]...]
So, in this case you could quicly determine the ranges for particular X.
Upvotes: 0
Reputation: 35667
Make your list disjoint (as you suggested), combining ranges which overlap. Then sort the array of endpoints, and for each data element, do a binary search and determine whether it's within a range or outside it. Even elements will always begin a range, odd elements will always end a range.
HTH.
Upvotes: 0
Reputation: 146930
You can use iterators into your containers. For example, std::vector provides the "at" method. These iterators can be contiguous, overlapping, or separated.
Upvotes: 0
Reputation: 283684
If you sort your list of ranges, you can then use binary search to minimize iteration. But really, unless you have an large number of ranges, iteration will be fastest.
Upvotes: 0