Reputation: 745
I'm looking to hash an unordered container, such as an unordered_map
and unordered_set
.
For an ordered type, like a vector, boost::hash_range(v.begin(). v.end())
works well, but it is also dependent on order, e.g.
#include <boost/functional/hash.hpp>
#include <functional>
namespace std {
template<>
struct hash<std::vector<int>> {
size_t operator ()(const std::vector<int>& v) const noexcept {
return boost::hash_range(v.begin(), v.end());
}
};
}
Example of this working: https://coliru.stacked-crooked.com/a/0544c1b146ebeaa0
boost.org says
If you are calculating a hash value for data where the order of the data doesn't matter in comparisons (e.g. a set) you will have to ensure that the data is always supplied in the same order.
Okay, so that would seem easy - just sort the data in some way, but I don't want to do this every time I hash it. Using a normal map
or set
could work but I would need to do a fair bit of re-writing.
Additionally, this would require every type I use to have either >
, <
, <=
or >=
defined, as well as ==
and std::hash
.
How can I hash a container so that the order does not matter?
Upvotes: 2
Views: 647
Reputation: 416
After sorting the hash values of individual container elements, the sorted list of hash values can be hashed again to obtain a hash value for the unordered container.
Assume H1
is the hash function for a single element and H2
is the hash function for a list of hash values, then the hash value for some unordered container with elements A, B, and C could be calculated as H2(SORT(H1(A), H1(B), H1(C)))
. By construction, the resulting hash value will be independent of the order. In this way, you will also get a stronger hash value compared to combining the individual hash values using commutative operations.
Upvotes: 2
Reputation: 133577
The requirement seems rather logical, since the hash function is combining the previous elements hash with the current element hash somehow, then the order is important, because
H(A, B, C)
is then computed as H(H(H(A), B), C)
so that each intermediate result is used as input to the next element (think about a block cipher).
To hash a sequence of elements without caring for ordering you'd need a commutative hash function, so you'd be restricted to commutative operations (eg. XOR). I'm not sure how strong could be such an hash function but for your specific scenario it could be sufficient.
Upvotes: 2