emu
emu

Reputation: 1673

Is it acceptable to use std::merge for overlapping ranges

I have an algorithm which requires applying set union many times to growing sets of integers. For efficiency I represent the sets as sorted vectors, so that their union can be obtained by merging them.

A classical way to merge two sorted vectors is this:

void inmerge(vector<int> &a, const vector<int> &b) {
  a.reserve(a.size() + b.size());
  std::copy(b.begin(), b.end(), std::back_inserter(a));
  std::inplace_merge(a.begin(), a.end() - b.size(), a.end());
}

Unfortunately, std::inplace_merge appears to be much slower than std::sort in this case, because of the allocation overhead. The fastest way is to use std::merge directly to output into one of the vectors. In order not to write a value before reading it, we have to proceed from the ends, like this:

void inmerge(vector<int> &a, const vector<int> &b) {
  a.resize(a.size() + b.size());
  orig_a_rbegin = a.rbegin() + b.size();
  std::merge(orig_a_rbegin, a.rend(), b.rbegin(), b.rend(), a.rend(), [](int x, int y) { return x > y; });
}

It is for sure that an implementation of merge will never write more elements than it has read, so this is a safe thing to do. Unfortunately, the C++ standard (even C++17 draft) forbids this:

The resulting range shall not overlap with either of the original ranges.

Is it okay to ignore this restriction if I know what I'm doing?

Upvotes: 1

Views: 291

Answers (2)

Jesper Juhl
Jesper Juhl

Reputation: 31457

To state it simply: No.

A bit longer: If you ignore a mandate by the standard you end up in Undefined Behaviour land and your compiler is free to do whatever it wants. This includes doing exactly what you expect, doing nothing at all, crashing the program, deleting all your files or summoning nasal demons. That's not a place you want to be.

Upvotes: 1

Sebastian Redl
Sebastian Redl

Reputation: 71899

No, ignoring a mandate of the standard (or any other documentation of some library you're using) is never ok. You may know what you are doing, but are you sure you know what the library is doing - or might be doing in the next version?

For example, the merge algorithm could detect that at least two of your ranges are reverse ranges, unwrap them (and unwrap or reverse the third), and do the merge in the other direction. No observable difference as long as the preconditions are kept, but possibly a tiny bit faster since the overhead of the reverse iterators is gone. But it would really screw with your code.

Upvotes: 4

Related Questions