Indiana Kernick
Indiana Kernick

Reputation: 5341

Is there an efficient algorithm for merging numeric ranges?

I am given series of ranges and I need to iterate each number in any of the ranges exactly once. The ranges may overlap and contain the same numbers.

The numbers in the range are

using Number = uint32_t;

Ranges are of this form

struct Range {
  Number first;
  Number last;
  Number interval;
};

Just to clarify the representation of Range.

Range range = {
  2,  //first
  14, //last
  3   //interval
};

//is equivalent to...

std::vector<Number> list = {2, 5, 8, 11, 14};

I have a few Ranges and I need to efficiently iterate all of the numbers in any order only once.

How do I efficiently iterate a set of ranges?

Also, Is there there a more efficient algorithm if interval is always 1?

Upvotes: 3

Views: 431

Answers (2)

Mic
Mic

Reputation: 331

If the sequences are very long you might like to just take each result in order, without storing the list, discarding duplicates as you go.

#include <vector>

// algorithm to interpolate integer ranges/arithmetic_sequences
template<typename ASqs, typename Action>
void arithmetic_sequence_union(ASqs arithmetic_sequences, Action action)
{
    using ASq = ASqs::value_type;
    using T = ASq::value_type;
    std::vector<ASq> remaining_asqs(begin(arithmetic_sequences), end(arithmetic_sequences));
    while (remaining_asqs.size()) {
        // get next value
        T current_value = **std::min_element(begin(remaining_asqs), end(remaining_asqs),
            [](auto seq1, auto seq2) { return *seq1 < *seq2; }
        );
        // walk past this value and any duplicates, dropping any completed arithmetic_sequence iterators
        for (size_t seq_index = 0; seq_index < remaining_asqs.size(); )
        {
            ASq &asq = remaining_asqs[seq_index];
            if (current_value == *asq // do we have the next value in this sequence?
                && !++asq) { // consume it; was it the last value in this sequence?
                remaining_asqs.erase(begin(remaining_asqs) + seq_index);//drop the empty sequence
            }
            else {
                ++seq_index;
            }
        }
        action(current_value);
    }
}

This wants the range presented in a "generator"-type object. Would probably look very like the implementation of checked a iterator, but iterators don't expose the notion of knowing they are at the end of the sequence so we might have to roll our own simple generator.

template <typename ValueType, typename DifferenceType>
class arithmetic_sequence {
public:
    using value_type = ValueType;
    using difference_type = DifferenceType;
    arithmetic_sequence(value_type start, difference_type stride, value_type size) : 
        start_(start), stride_(stride), size_(size) {}
    arithmetic_sequence() = default;
    operator bool() { return size_ > 0; }
    value_type operator*() const { return start_; }
    arithmetic_sequence &operator++() { --size_; start_ += stride_; return *this;}
private:
    value_type start_;
    difference_type stride_;
    value_type size_;
};

Test example:

#include "sequence_union.h"
#include "arithmetic_sequence.h"
#include <cstddef>
#include <array>
#include <algorithm>
#include <iostream>

using Number = uint32_t;

struct Range {
    Number first;
    Number last;
    Number interval;
};

using Range_seq = arithmetic_sequence<Number, Number>;


Range_seq range2seq(Range range)
{
    return Range_seq(range.first, range.interval, (range.last - range.first) / range.interval + 1 );
}

int main() {
    std::array<Range, 2> ranges = { { { 2,14,3 },{ 2,18,2 } } };
    std::array<Range_seq, 2> arithmetic_sequences;
    std::transform(begin(ranges), end(ranges), begin(arithmetic_sequences), range2seq);

    std::vector<size_t> results;
    arithmetic_sequence_union(
        arithmetic_sequences,
        [&results](auto item) {std::cout << item << "; "; }
    );

    return  0;
}

// output: 2; 4; 5; 6; 8; 10; 11; 12; 14; 16; 18;

Upvotes: 0

Daniel Jour
Daniel Jour

Reputation: 16156

For each range, remember the "current" value (going from first to last with the step size). Put that along with the range in a priority queue, sorted after the current value.

Take the top out, if its current value is different from the last, then use it. Then, insert the next step if there is another.

Assumes positive step size.

template<typename Iterator, typename Operation>
void iterate_ranges (Iterator from, Iterator to, Operation op) {
  using R = typename std::iterator_traits<Iterator>::value_type;
  using N = typename std::decay<decltype(std::declval<R>().first)>::type;
  using P = std::pair<N, R>;
  auto compare = [](P const & left, P const & right) {
    return left.first > right.first;};

  std::priority_queue<P, std::vector<P>, decltype(compare)> queue(compare);

  auto push = [& queue] (P p) {
    if (p.first < p.second.last) queue.push(p); };
  auto next = [](P const & p) -> P {
    assert(p.second.step > 0);
    return {p.first + p.second.step, p.second}; };
  auto init = [&push] (R const & r) {
    push({r.first, r}); };

  std::for_each(from, to, init);

  if (queue.empty()) return;

  N last = queue.top().first;
  push(next(queue.top()));
  queue.pop();
  op(last);

  while (! queue.empty()) {
    P current = queue.top();
    queue.pop();
    if (current.first != last) {
      op(current.first);
      last = current.first;
    }
    push(next(current));
  }
}

Memory requirement: linear in the number of ranges. Time requirement: sum of all step counts within each range.

Small example:

struct Range {
  int first;
  int last;
  int step; // a better name ...
};


int main() {
  Range ranges [] = {
    {1, 10, 2},
    {2, 50, 5}};

  auto print = [](auto n) { std::cout << n << std::endl; };

  iterate_ranges(std::begin(ranges), std::end(ranges), print);
}

To get all numbers in a vector, use a lambda with a reference to a vector and push back each one.

Is there there a more efficient algorithm if interval is always 1?

You could add that as a special case, but I don't think it will be necessary. If you only got ~50 ranges, then above push won't be that expensive. Though, with all optimisation: profile first!

Upvotes: 3

Related Questions