Reputation: 3727
I am using a std::unordered_map. I have a hash value and a way to determine if a given candidate key is the key that I am looking for, but I do not have an actual key. I want to look up the bucket corresponding to the hash value and go through each element in that bucket to see if it is the element that I am looking for. Unfortunately, the function std::unordered_map::bucket(x) requires x to be a key. Is there really no way to get a bucket from a hash value without first constructing a key?
Details that you don't need to answer the question: I could construct the key but in the common case of no collisions this will take longer than only checking if the single candidate I have found in the bucket is the right one. I have a low load factor so there are few collisions and even for a collision the full hash value is unlikely to match, so non-matches are quickly determined not to match. I care about this because I have determined with a profiler that key construction is taking a significant amount of time - there are many lookups and each lookup requires the construction of a key.
Even more details that you really don't need to answer the question: The keys are vectors of integers and my query is for the sum of two vectors. It is faster to check if a given vector V is the sum of two vectors A and B than to sum the two vectors into a third vector C=A+B and then compare C to V. I am able to determine the hash value of A+B without calculating the actual vector A+B because I store the hash values of these vectors and my hash function f has the property that f(A+B)=f(A)+f(B). So I just add the two stored hash values to get the hash value of the sum. I have already made sure to keep a spare vector around so that constructing a key does not require memory allocation but the code for adding the vectors is still taking a significant amount of time on its own.
Upvotes: 15
Views: 3063
Reputation: 726499
You cannot avoid constructing a key, but you can avoid constructing the entire key.
For example, let's say that you have a key class VectorKey
that encapsulates an std::vector
, and caches the computed hash code. Further suppose that you provide implementations of Hash
and KeyEqual
that access the cached hash code off your VectorKey
, and compare encapsulated vectors for equality. You can define a constructor of VectorKey
that always constructs an empty std::vector
, and sets the cached hash code to a value passed to the constructor:
class VectorKey{
int cached_hash;
std::vector<int> key;
public:
VectorKey(const std::vector<int>& _key)
: key(_key)
, cached_hash(calc_hash(_key)) {
}
// *** This is the centerpiece of the solution: ***
// *** this constructor effectively lets you access ***
// *** a bucket with nothing more than a hash code. ***
VectorKey(int hash)
: cached_hash(hash) {
}
// More code goes here for getting cached_hash
// and also for checking equality
private:
int calc_hash(const std::vector<int>& _key) {
// calculate the hash code based on the vector
}
};
With a key class like that, you can quickly find buckets by constructing a fake key:
size_type bucketIndex = myHashMap.bucket(VectorKey(precalculated_hash));
Upvotes: 11