Reputation: 1128
I'm brand new to C++ and have been asked to convert a Java program to C++. I'm trying to write a method to check that all elements in an unordered_set exist in another unordered_set. I found the example below using hash_set but hash_set is deprecated and it is recommended to use unordered_set now.
// returns true if one contains all elements in two
bool SpecSet::containsAll(hash_set<Species*> one, hash_set<Species*> two) {
sort(one.begin(), one.end());
sort(two.begin(), two.end());
return includes(one.begin(), one.end(), two.begin(), two.end());
}
So I need a way to do this using unordered_set. Sort does not work on unordered sets and lookup speed is important, so I don't want to use an ordered set.
bool SpecSet::containsAll(unordered_set<Species*> one, unordered_set<Species*> two) {
return ?;
}
I'd really appreciate some help with an approach to doing this efficiently.
EDIT: I guess this will work. It seems there is no more efficient way but to loop over all in two.
bool SpecSet::containsAll(unordered_set<Species*> one, unordered_set<Species*> two) {
if(two.size() > one.size())
{
return false;
}
for(Species *species : two)
{
if(one.find(species) == one.end())
{
return false;
}
}
return true;
}
Upvotes: 6
Views: 4201
Reputation: 4214
Disclaimer: This is not the most efficient approach. It is an attempt at solution that would be as generic and flexible as std::includes
while supporting unordered iterator ranges. It is not limited to std::unordered_set
and should work for any other container, e.g., std::vector
or std::list
.
As it was pointed out std::includes
requires the input ranges to be sorted. At this moment unordered ranges are not supported in the standard library.
Looking at possible implementations of std::includes
a version for unordered ranges can be implemented. For example like so:
template<class InputIt1, class InputIt2>
bool includes_unordered(
InputIt1 first1, InputIt1 last1,
InputIt2 first2, InputIt2 last2)
{
for (; first2 != last2; ++first2)
{
InputIt1 it1;
for (it1 = first1; it1 != last1; ++it1)
{
if(*first2 == *it1)
break;
}
if (it1 == last1)
return false;
}
return true;
}
Note: containers' size-comparison optimization is not performed to support containers of non-unique objects. But if needed it can be done using std::distance
.
And here's a version taking an equivalence operator:
template<class InputIt1, class InputIt2, class Equivalence>
bool includes_unordered(
InputIt1 first1, InputIt1 last1,
InputIt2 first2, InputIt2 last2,
Equivalence equiv)
{
for (; first2 != last2; ++first2)
{
InputIt1 it1;
for (it1 = first1; it1 != last1; ++it1)
{
if(equiv(*first2, *it1))
break;
}
if (it1 == last1)
return false;
}
return true;
}
Then includes_unordered
can be used in the same way as std::includes
would.
Upvotes: 1
Reputation: 30831
With unsorted collections, there's no faster algorithm than to iterate over the smaller collection while testing that each element is a member of the larger. This will naturally scale as O(n), where n is the size of the putative subset, since we perform an O(1) find operation n times.
Here's some demonstration code, with tests:
#include <unordered_set>
template <typename T>
bool is_subset_of(const std::unordered_set<T>& a, const std::unordered_set<T>& b)
{
// return true if all members of a are also in b
if (a.size() > b.size())
return false;
auto const not_found = b.end();
for (auto const& element: a)
if (b.find(element) == not_found)
return false;
return true;
}
int main()
{
const std::unordered_set<int> empty{ };
const std::unordered_set<int> small{ 1, 2, 3 };
const std::unordered_set<int> large{ 0, 1, 2, 3, 4 };
const std::unordered_set<int> other{ 0, 1, 2, 3, 9 };
return 0
+ is_subset_of(small, empty) // small ⊄ ∅
+ !is_subset_of(empty, small) // ∅ ⊂ small
+ is_subset_of(large, small) // large ⊄ small
+ !is_subset_of(small, large) // small ⊂ large
+ is_subset_of(large, other) // large ⊄ other
+ is_subset_of(other, large) // other ⊄ large
+ !is_subset_of(empty, empty) // ∅ ⊂ ∅
+ !is_subset_of(large, large) // x ⊂ x, ∀x
;
}
An equivalent, using standard algorithm instead of writing an explicit loop:
#include <algorithm>
#include <unordered_set>
template <typename T>
bool is_subset_of(const std::unordered_set<T>& a, const std::unordered_set<T>& b)
{
// return true if all members of a are also in b
auto const is_in_b = [&b](auto const& x){ return b.find(x) != b.end(); };
return a.size() <= b.size() && std::all_of(a.begin(), a.end(), is_in_b);
}
(obviously using the same main()
for tests)
Note that we pass the sets by reference, not by value, as you've indicated that the sets are too large for you to copy and sort them.
Upvotes: 1