Reputation: 6084
Today I was asking myself what might be the shortest possible code to obtain all values in a sorted vector std::vector<double>
, which are greater or equal to a
and smaller or equal to b
.
My first approach was something like the following:
#include <vector>
#include <algorithm>
#include <iterator>
#include <iostream>
// Returns all values in sortedValues being greater equal start and smaller equal end;
std::vector<double> cutValues(const std::vector<double>& sortedValues, double start, double end) {
std::vector<double> ret;
auto startIter=std::lower_bound(sortedValues.begin(), sortedValues.end(), start);
auto stopIter = std::upper_bound(sortedValues.begin(), sortedValues.end(), end);
std::copy(startIter, stopIter, std::back_inserter(ret));
return ret;
}
int main(int argc, char **args) {
{
auto ret = cutValues({ 0.1,0.2,0.3 }, 0.1, 0.3);
std::copy(ret.begin(), ret.end(), std::ostream_iterator<double>(std::cout, ","));
std::cout << std::endl;
}
{
auto ret = cutValues({ 0.12,0.2,0.31 }, 0.1, 0.3);
std::copy(ret.begin(), ret.end(), std::ostream_iterator<double>(std::cout, ","));
std::cout << std::endl;
}
{
auto ret = cutValues({ 0.1,0.2,0.3 }, 0.2, 0.2);
std::copy(ret.begin(), ret.end(), std::ostream_iterator<double>(std::cout, ","));
std::cout << std::endl;
}
}
My second thought was something very simple like the following:
std::vector<double> cutValues2(const std::vector<double>& sortedValues, double start, double end) {
std::vector<double> ret;
std::copy_if(sortedValues.begin(), sortedValues.end(), std::back_inserter(ret), [&start, &end](auto v) { return v >= start && v <= end; });
return ret;
}
But considering the case of only removing small portions from a very large vector it might have some efficiency issues.
Now I'm asking myself, if there there are even better ways to do it?
Upvotes: 8
Views: 1090
Reputation: 126787
Not shorter, but generally more efficient: use std::lower_bound
to find the start of the "interesting" interval, and go on copying as long as the elements are less than your maximum value; that way you are locating quickly (O(log N)) the start even for big vectors and you don't waste time doing a binary search again looking for the end of the interval - you are going to find it anyway during the copy loop.
The possibly extra compares are virtually free anyway, as double
compares themselves are extremely cheap, the branch is well predicted (it's always the same until the end of the copy) and depends on data already in cache; compare with an extra binary search, which "jumps around" in memory (worse cache locality) and has less predictable branches.
std::vector<double> cutValues(const std::vector<double>& sortedValues, double minv, double maxv) {
std::vector<double> ret;
auto end = sortedValues.end();
for(auto it = std::lower_bound(sortedValues.begin(), end, minv);
it != end && *it <= max; ++it) {
ret.push_back(*it);
}
return ret;
}
This can be rewritten in a similar way to your cutValues2
using STL algorithms, but I'm not sure it's much of an improvement.
std::vector<double> cutValues(const std::vector<double>& sortedValues, double minv, double maxv) {
std::vector<double> ret;
std::copy_if(
std::lower_bound(sortedValues.begin(), sortedValues.end(), minv),
sortedValues.end(),
std::back_inserter(ret),
[=](double v) { return v <= maxv; });
return ret;
}
Upvotes: 2
Reputation: 2765
Once you know where your interval starts, there is no need to look through the hole vector again to find the end iterator. Start searching after the start_iterator
instead
auto startIter = std::lower_bound(sortedValues.begin(), sortedValues.end(), start);
if (startIter == sortedValues.end()) {
// either the last element was inside the interval (return it) or not (return empty vector)
return *(--startIter) == start ? {start} : {};
}
auto stopIter = std::upper_bound(++startIter, sortedValues.end(), end);
// start from here
std::copy(--startIter, stopIter, std::back_inserter(ret));
Other optimizations would be content dependent. E.g. If you know that the last element inside your interval would be rather at the end of the vector it would make more sense to iterate over it backwards.
You may also consider checking the interval first to prevent unnecessary execution if a is greater or equal to b.
Upvotes: 0
Reputation: 9117
A bit changed first version:
std::vector<double> cutValues(const std::vector<double>& sortedValues, double start, double end) {
auto startIter = std::lower_bound(sortedValues.begin(), sortedValues.end(), start);
auto stopIter = std::upper_bound(startIter, sortedValues.end(), end);
return std::vector<double>(startIter, stopIter);
}
Upvotes: 6
Reputation: 15524
You could write your function STL-style and make it work with any container of any type (and it will be more elegant imho).
template <typename ForwardIt, typename T>
auto bounded_range(ForwardIt first, ForwardIt last, T lb, T ub) {
return std::pair{std::lower_bound(first, last, lb), std::upper_bound(first, last, ub)};
}
std::vector<double> v1{0.1, 0.1, 0.2, 0.3, 0.3, 0.5, 0.7, 0.9};
auto [beg, end] = bounded_range(std::begin(v), std::end(v), 0.3, 0.5);
std::vector<double> v2{beg, end};
// Or using 'std::make_from_tuple'.
auto v3 = std::make_from_tuple<std::vector<double>>(bounded_range(std::begin(v), std::end(v), 0.3, 0.5));
Note: examples uses C++17 template argument deduction for class templates, and structured bindings.
Upvotes: 1