Bjarke H. Roune
Bjarke H. Roune

Reputation: 3727

Find bucket in unordered_map from hash without a key

I am using a std::unordered_map. I have a hash value and a way to determine if a given candidate key is the key that I am looking for, but I do not have an actual key. I want to look up the bucket corresponding to the hash value and go through each element in that bucket to see if it is the element that I am looking for. Unfortunately, the function std::unordered_map::bucket(x) requires x to be a key. Is there really no way to get a bucket from a hash value without first constructing a key?

Details that you don't need to answer the question: I could construct the key but in the common case of no collisions this will take longer than only checking if the single candidate I have found in the bucket is the right one. I have a low load factor so there are few collisions and even for a collision the full hash value is unlikely to match, so non-matches are quickly determined not to match. I care about this because I have determined with a profiler that key construction is taking a significant amount of time - there are many lookups and each lookup requires the construction of a key.

Even more details that you really don't need to answer the question: The keys are vectors of integers and my query is for the sum of two vectors. It is faster to check if a given vector V is the sum of two vectors A and B than to sum the two vectors into a third vector C=A+B and then compare C to V. I am able to determine the hash value of A+B without calculating the actual vector A+B because I store the hash values of these vectors and my hash function f has the property that f(A+B)=f(A)+f(B). So I just add the two stored hash values to get the hash value of the sum. I have already made sure to keep a spare vector around so that constructing a key does not require memory allocation but the code for adding the vectors is still taking a significant amount of time on its own.

Upvotes: 15

Views: 3063

Answers (1)

Sergey Kalinichenko
Sergey Kalinichenko

Reputation: 726499

You cannot avoid constructing a key, but you can avoid constructing the entire key.

For example, let's say that you have a key class VectorKey that encapsulates an std::vector, and caches the computed hash code. Further suppose that you provide implementations of Hash and KeyEqual that access the cached hash code off your VectorKey, and compare encapsulated vectors for equality. You can define a constructor of VectorKey that always constructs an empty std::vector, and sets the cached hash code to a value passed to the constructor:

class VectorKey{
    int cached_hash;
    std::vector<int> key;
public:
    VectorKey(const std::vector<int>& _key)
    :    key(_key)
    ,    cached_hash(calc_hash(_key)) {
    }
    // *** This is the centerpiece of the solution: *** 
    // *** this constructor effectively lets you access *** 
    // *** a bucket with nothing more than a hash code. *** 
    VectorKey(int hash)
    :    cached_hash(hash) {
    }
    // More code goes here for getting cached_hash
    // and also for checking equality
private:
    int calc_hash(const std::vector<int>& _key) {
         // calculate the hash code based on the vector
    }
};

With a key class like that, you can quickly find buckets by constructing a fake key:

size_type bucketIndex = myHashMap.bucket(VectorKey(precalculated_hash));

Upvotes: 11

Related Questions