Reputation: 185
I have the following vectors:
vector<unsigned> A,B1;
vector<pair<unsigned,unsigned> > B2;
I want to perform the intersection of (A,B1)
and then the intersection of (A,B2)
. I want to then perform the union of the two intersection results. Vectors A
and B1
contain sorted unsigned integers and vector B2
contains pairs of (start,end)
values. Example vectors A
and B2
, and their intersection vector is shown below:
vector<unsigned> A(2,4,6,8,9,10,34,74,79,81,89,91,95);
vector<pair<unsigned,unsigned> > B2={ {2, 3}, {29, 40}, {60, 85} };
vector<unsigned> intersection; //result of intersection of A and B2 -> Procedure of performing intersection is explained below
//intersection=(2,34,74,79,81);
2
is in intersection as 2
is in {2,3}
. Similarly 34
is in intersection as 34
lies between {29,40}
. Similarly 74
, 79
, 81
are in intersection as they lie within B2
's last element {60,85}
's range.
Is there some efficient way by which I may get the same results as:
(1). intersection A
and B1
;
(2). intersection of A
and B2
;
(3). union of the two intersections performed in step 1 and 2 (i.e. intersection (A,B1)
and (A,B2)
)
Upvotes: 1
Views: 915
Reputation: 69922
Update:
After re-reading the question I realised that you probably wanted to do all three operations in one.
We can reduce the number of iterations of the outer loop to one and remove one inner loop by composing three functions into one and returning a tuple:
#include <iostream>
#include <vector>
#include <algorithm>
#include <iterator>
#include <tuple>
std::tuple<std::vector<unsigned>, std::vector<unsigned>, std::vector<unsigned>>
find_everything(const std::vector<unsigned>& a,
const std::vector<std::pair<unsigned,unsigned> >& b1,
const std::vector<std::pair<unsigned,unsigned> >& b2)
{
std::vector<unsigned> r1, r2, both;
for (auto x : a)
{
auto either = false;
auto i = std::find_if(std::begin(b1),
std::end(b1),
[x](const auto& range)
{ return x >= range.first && x < range.second; });
if (i != std::end(b1)) {
either = true;
r1.push_back(x);
}
i = std::find_if(std::begin(b2),
std::end(b2),
[x](const auto& range)
{ return x >= range.first && x < range.second; });
if (i != std::end(b2)) {
either = true;
r2.push_back(x);
}
if (either) {
both.push_back(x);
}
}
return std::make_tuple(std::move(r1), std::move(r2), std::move(both));
}
int main()
{
using namespace std;
vector<unsigned> A { 2,4,6,8,9,10,34,74,79,81,89,91,95 };
vector<pair<unsigned,unsigned> > B1={ {4, 5}, {8, 10}, {90, 99} };
vector<pair<unsigned,unsigned> > B2={ {2, 3}, {29, 40}, {60, 85} };
auto results = find_everything(A, B1, B2);
const auto& r1 = std::get<0>(results);
const auto& r2 = std::get<1>(results);
const auto& both = std::get<2>(results);
copy(begin(r1), end(r1), ostream_iterator<unsigned>(cout, ", "));
cout << endl;
copy(begin(r2), end(r2), ostream_iterator<unsigned>(cout, ", "));
cout << endl;
copy(begin(both), end(both), ostream_iterator<unsigned>(cout, ", "));
cout << endl;
return 0;
}
expected results:
4, 8, 9, 91, 95,
2, 34, 74, 79, 81,
2, 4, 8, 9, 34, 74, 79, 81, 91, 95,
Further work:
There are two immediately obvious improvements we could make if the datasets are large:
Since A, B1 and B2 are sorted, we can track "current" match iterators, reducing the search space on each match or non-match (this starts to argue for std::lower_bound
)
or if A is significantly larger than B1 and B2, we could parallelise the search.
Have fun :)
Upvotes: 1