Cimbali
Cimbali

Reputation: 11395

A container for integer intervals, such as RangeSet, for C++

I am trying to work with ranges, as in ranges of numbers. By that I mean integer intervals, in maths speak. And I want to store a set of them. I also want this set to naturally merge (or coalesce) ranges I insert.

Let's go for a simple example, I start with an empty set: { }

I found out thanks to a similar question that this exists in Java as a Google Guava library and is called a RangeSet.

I was initially thinking of using a std::set of std::pairs that would be sorted on the lower bound (so the first element of each pair). Then after each insertion I would have to manually merge any overlapping sets.

As this seems a common problem, is there a good implementation I couldn't find due to the noise with all the synonyms of "range" in C++? Or does anyone care to share their own? I only want to print the final ranges but bonus points for generality if you have other set operations.

Upvotes: 17

Views: 12178

Answers (6)

Cimbali
Cimbali

Reputation: 11395

Credit to Arne Vogel for noting that a set of pairs indexed on their first element is really a map.

Hence pretty close to my initial thoughts and to Useless' answer (except simple on comparing bounds); we get this:

typedef std::pair<int, int> Range;

class RangeSet : public std::map<int, int> {
public:

std::pair<RangeSet::iterator, bool> insert(const Range& range) {
    assert(range.first <= range.second);

    RangeSet::iterator after = upper_bound(range.first), insert_range;

    if (after == begin() or std::prev(after)->second < range.first) {
        insert_range = std::map<int, int>::insert(after, range);
    }   
    else {
        insert_range = std::prev(after);
        if (insert_range->second >= range.second) {
            return std::pair<RangeSet::iterator, bool>(insert_range, false);
        }   
        else {
            insert_range->second = range.second;
        }   
    }   

    while (after != end() and range.second >= after->first) {
        insert_range->second = std::max(after->second, insert_range->second);
        after = erase(after);
    }   

    return std::pair<RangeSet::iterator, bool>(insert_range, true);
}   

};

With the boolean returned being true iff there is at least one element added to the total set.

Upvotes: 3

hl037_
hl037_

Reputation: 3867

I have written such a library for my use, it's extensively tested, and available here : https://github.com/hl037/rangeset.hpp

It's STL compatible and quite straight forward to use. It is based on the idea from Ben Voigt, but without the sub-ranges part.

Upvotes: 0

ulatekh
ulatekh

Reputation: 1490

Here is a GPL-v2-licensed class I wrote a long time ago that does intervals. It's actually a 2D-region class. It also supports flood-fill.

The programming style is somewhat idiosyncratic, and it does a lot more than you asked for, but you may find it helpful.

Upvotes: 1

Useless
Useless

Reputation: 67723

It turns out to be reasonably straightforward to implement your first idea.

To insert:

  1. use lower_bound to find the preceding range, but by comparing the new range's lower bound to the old ranges' upper bounds.

    This only works because the upper bounds have exactly the same ordering as the lower bounds, so we preserve the correct sequence

  2. if you overlap the returned range, merge in-place (just mutate it), otherwise insert

  3. while your current range (whether mutated in-place or newly inserted) overlaps the subsequent range, merge them

that is, you only need to merge in one direction.

Note that this requires an unusual comparison which isn't a strict weak ordering (if range A is wholly contained in range B, you probably get A < B < A), so I'm not sure that putting it in a std::set will work - just maintaining a sorted list or vector should be easy enough though.

Upvotes: 1

Ben Voigt
Ben Voigt

Reputation: 283614

If you encode ranges as a sequence of endpoints and step direction, instead of start/end pairs, then finding union should become much easier, just a simple mergesort.

(0, +) (5, -)

(0, +) (5, -) (10, +) (15, -)

(0, +) (5, +) (5, -) (7, -) (10, +) (15, -)

Look, the overlapping range appears as nested ranges. Just preserve only the outermost ones.

(0, +) (5, +) (5, -) (7, -) (10, +) (15, -)
   1      2      2     1       1       1       <= depth
(0, +) (7, -) (10, +) (15, -)

(0, +) (7, -) (10, +) (12, +) (15, -) (17, -)
   1      1      1       2       2       1
(0, +) (7, -) (10, +) (17, -)


(0, +) (6, +) (7, -) (10, +) (13, -) (17, -)
   1      2      2      2       2       1
(0, +) (17, -)

I think that finding intersections becomes simple also, now you preserve only endpoints with a nesting level of 2, instead of deleting them.

Upvotes: 10

Jens
Jens

Reputation: 9406

Boost has the Interval Container Library (ICL). If you want to do computations on intervals, e.g. represent sin(I) for an interval I, there is also a Interval Arithmetic Library in boost.

Upvotes: 6

Related Questions