Reputation: 493
My goal is to implement a rehash() method for a hash table I am writing in a C++ template class. However, I am confused what exactly I can hash on since this method only knows the type as a generic. As far as I know, this differs from Java generics
My understanding of java... In Java, since every object inherits the hashCode() method, we can override this method. If writing a generic hash table in java, then during the rehash(), the hashCode() method can simply be called to return a value. That can be modded by tableSize. Painless
Back to C++... Since I do not believe I can simply call a hashCode() method on my C++ generics nor access a generic type's data members, I don't know what I could rehash on.
Is it possible for my generic type in C++ to inherit an abstract class somewhow so I can force a virtual hashcode()=0 method? Or do the hash functions in template classes rely some other kind of non-datamember hashing?
Tried to be as clear as possible. Thanks to anyone that can point me in the right direction or offer some guidance.
Upvotes: 1
Views: 2603
Reputation: 14184
NOTE: Of course, I suppose that your goal to implement a hashtable is for learning purposes. If not, use std::unordered_map
.
The most important point here is what you have noticed yet: Templates and generics are not the same thing. (Well, and java generics are only a fancy syntactic sugar tool, thanks to type-erasure, but that its other story).
C++ templates works writting a version of a class for every different instantation of the template. That is, if in your program you use std::vector<int>
and std::vector<bool>
, the compiler generate code for class vector with int as type, and class vector with bool as type.
One of the most powerfull aspects of templates is that every type-related operation is evaluated at compile-time. That is, every template instantation, alias, typedef, etc, is done at compile time. The code that you get at runtime its a compound of the final generated classes for the different template instances. So you don't think about types at runtime, like in a OO language like java or C#, you think about compile-time. Everithing must be resolved when compilation ends.
Now we look at your problem:
You want an "hashable" "interface" for your types, to call to a hash function within the hashtable.
This can be done in two ways:
One way to solve your problem is to suppose that your types have a hash()
public member function (Like getHashCode()
in java, that is inherited from Object).
So, in your hashtable implementation, you use this hash function as if the element has it. Don't worry about the type passed.
The fact is if you do this, and you pass a type as template argumment that has not a public hash member function, the template cannot be instantiated. Violá!
Think about templates as contracts: You write a completely generic code in the template. The types passed to the template must fullfill the contract. That is, must have everything that you have supposed they have.
The problem with this approach is that any type used in the hashmap must have a hash public member function. Note that you cannot use basic types in the hashmap, with this way.
A functor is a class that acts as a function. That is, an instance of that class can be used as it was a function. For example:
template<typename T>
struct add
{
T operator()(const T& a , const T& b)
{
return a + b;
}
};
You could use this functor as follows:
int main()
{
add<int> adder;
int a = 1 , b = 2;
int c = adder(a,b);
}
But the most important use of functors is based in that functors are instances of classes, so can be passed to other sites as argumments. That is, a functor acts as a high-level function pointer.
This is used in generic STL algorithms like std::find_if
: Find if is used to find an element of a specified interval, based on a search criteria. That criteria is passed with a functor that acts as a boolean predicate. For example:
class my_search_criteria
{
bool operator()(int element)
{
return element == 0;
}
};
int main()
{
std::vector<int> integers = { 5 , 4 , 3 , 2 , 1 , 0 };
int search_result = std::find_if( std::begin( integers ) , std::end( integers ) , my_search_criteria() );
}
But, how functors could help with your problem?
You could immplement a generic functor that acts as hash function:
template<typename T>
struct hash
{
unsigned int operator()(const T& element)
{
return /* hash implementation */
}
};
and use it in your hashtable class:
template<typename T>
class hachtable
{
private:
hash<T> hash_function;
std::vector<T> _container;
void add(const T& element)
{
_container.insert(std::begin( _container ) + hash_function( element ) , element);
}
};
Note that you need that hash was implemented for the type of the element.
C++ templates allows you to write special explicit cases of your template. For example, you write a generic array class, and you noticed that if the type of the elements are boolean, it could be more efficient if you store the booleans as bits of numbers, to reduce memory consumption. With C++ templates you could write that special cases. You write explicitly the template class with a explicit type as template argumment. This is known as "template specialization". In fact, that "use bits for boolean case" example is exactly what std::vector
does.
In our case, if we have a declaration of a hash functor:
template<typename T>
struct hash;
We need a specialization for every type that you will use in the hashmap. For example, a specialization for unsigned ints:
template<>
struct hash<unsigned int>
{
unsigned int operator()(unsigned int element)
{
return element;
}
};
The functor-based way is exactly what the C++ standard libray does. It has a definition of a hash functor, std::hash
, and uses it in hashtable implementations, like std::unordered_map
. Note that the libray has a set of built-in hash specializations for the basic types.
Upvotes: 1