user1382306
user1382306

Reputation:

unordered_map hash function return

I'd like to have an unordered_map with a struct, that I'd like to use as the key, of multiple std::set< std::string >.

I see that a custom hash function is required and that a string can have std::hash applied; however, I cannot determine what should be returned to satisfy the purpose of the hash function of these sets for an unordered_map.

How should a custom hash function return?

Upvotes: 1

Views: 1093

Answers (2)

Felix Glas
Felix Glas

Reputation: 15524

The requirements of std::hash is as follows: (http://en.cppreference.com/w/cpp/utility/hash)

The hash template defines a function object that implements a hash function. Instances of this function object satisfy Hash. In particular, they define an operator() that:

  1. Accepts a single parameter of type Key.
  2. Returns a value of type size_t that represents the hash value of the parameter.
  3. Does not throw exceptions when called.
  4. For two parameters k1 and k2 that are equal, std::hash<Key>()(k1) == std::hash<Key>()(k2).
  5. For two different parameters k1 and k2 that are not equal, the probability that std::hash<Key>()(k1) == std::hash<Key>()(k2) should be very small, approaching 1.0 / std::numeric_limits<size_t>::max().

The hash template is both CopyConstructible and Destructible.

So what you need is basically a function that returns a std::size_t that is unique for every myStruct object and that returns the same value for objects that are considered equivalent.

Edit: The following may not be the most robust way to generate the hash, but it will serve as a basic example of how it can be done.

One way to do it is to use the standard specialization for std::hash<std::string> by concatenating all the strings in each std::set member using a separator sequence and then concatenating all the resulting merged strings into one and returning the hash value using the standard hash function.

The merged "super"-string will be unique for each myStruct object if the member std::sets differ, and still same when the members don't differ as std::set is an ordered container.

struct myStruct {
    std::set<std::string> s1;
    std::set<std::string> s2;
};

std::string mergeAllStrings(const myStruct& ms) {
    static const std::string SEPARATOR = "#¤%&"; // Some uncommon sequence.
    std::string super;
    for (const auto& s : ms.s1) {
        super += s + SEPARATOR; // Append separator.
    }
    for (const auto& s : ms.s2) {
        super += s + SEPARATOR; // Append separator.
    }
    return super;
}

int main() {
    myStruct ms1{{"apple", "pear", "orange"}, {"red", "green", "blue"}};
    myStruct ms2{{"pear", "apple", "orange"}, {"red", "green", "blue"}};
    myStruct ms3{{"apple", "banana", "orange"}, {"red", "green", "blue"}};

    std::cout << std::hash<std::string>()(mergeAllStrings(ms1)) << std::endl;
    std::cout << std::hash<std::string>()(mergeAllStrings(ms2)) << std::endl;
    std::cout << std::hash<std::string>()(mergeAllStrings(ms3)) << std::endl;
}

Output:

2681724430859750844 // Same
2681724430859750844 // Same
2942368903851914580 // Different

You can now create a hash functor e.g. as:

struct MyHash {
    std::size_t operator()(const myStruct& ms) const {
        return std::hash<std::string>()(mergeAllStrings(ms));
    }
};

and use it with std::unordered_map as:

std::unordered_map<myStruct, myValue, MyHash> m;

Note that you should provide a custom equal_to functor as well.

Upvotes: 1

Blastfurnace
Blastfurnace

Reputation: 18652

I think this may be a better alternative to Snps' answer. This implements a specialization of std::hash for a user-defined type and it hashes the struct without creating temporary strings.

I've copied two functions from Boost, hash_combine and hash_range, to compute a single hash value from two containers.

#include <iostream>
#include <functional>
#include <set>
#include <unordered_map>

// user-defined type
struct myStruct {
    std::set<std::string> s1;
    std::set<std::string> s2;

    bool operator==(const myStruct &other) const {
        return (s1 == other.s1) && (s2 == other.s2);
    }
};

// hash helper functions plagiarized from Boost
template <typename T>
void hash_combine(size_t &seed, const T &v)
{
    using std::hash;
    seed ^= hash<T>()(v) + 0x9e3779b9 + (seed << 6) + (seed >> 2);
}

template <typename It>
void hash_range(size_t &seed, It first, It last)
{
    for (; first != last; ++first) {
        hash_combine(seed, *first);
    }
}

// std::hash specialization
namespace std
{
    template<> struct hash<myStruct> {
        size_t operator()(const myStruct &key) const {
            size_t seed = 0;
            hash_range(seed, key.s1.begin(), key.s1.end());
            hash_range(seed, key.s2.begin(), key.s2.end());
            return seed;
        }
    };
}

int main()
{
    std::unordered_map<myStruct, int> myMap;

    myStruct ms1{ { "apple", "pear", "orange" }, { "red", "green", "blue" } };
    myStruct ms2{ { "pear", "apple", "orange" }, { "red", "green", "blue" } };
    myStruct ms3{ { "apple", "banana", "orange" }, { "red", "green", "blue" } };

    myMap[ms1] = 1;
    myMap[ms2] = 2;
    myMap[ms3] = 3;

    std::cout << myMap.size() << '\n'; // output: 2
}

Upvotes: 2

Related Questions