Reputation: 4863
I was trying some hashing algorithms with stdext::hash_value()
on VS2010 and realized this:
#include <iostream>
#include <xhash>
using namespace std;
int main()
{
#ifdef _WIN64
std::cout << "x64" << std::endl;
#else
std::cout << "x32" << std::endl;
#endif
std::cout << stdext::hash_value(345.34533) << std::endl;
std::cout << stdext::hash_value(345.566) << std::endl;
return 0;
}
// Output is:
// x64
//3735928758
//3735928758
I tried some other couples of double variable that has the same integer but different fractional part. Like 1.234 vs 1.568. Hash values were always the same. So I took a look at source of hash_value()
and saw
#define _HASH_SEED (size_t)0xdeadbeef
template<class _Kty> inline
size_t hash_value(const _Kty& _Keyval)
{ // hash _Keyval to size_t value one-to-one
return ((size_t)_Keyval ^ _HASH_SEED);
}
_KeyVal
is cast to size_t
which does not make sense to me. The function simply ignores the fractional part of double. What is the logic behind this implementation? Is it a bug or feature?
Upvotes: 3
Views: 2699
Reputation: 146910
VC10 contains the C++0x Standard hashing mechanisms, so there's no need to use the stdext
ones anymore, and std::hash
does contain a mechanism for double
which performs a bitwise conversion and then hashes. That code that you have for stdext
is just a fallback mechanism, and it's not really intended for use with floating-point types. I guess it's a design oversight.
template<>
class hash<double>
: public unary_function<double, size_t>
{ // hash functor
public:
typedef double _Kty;
typedef _ULonglong _Inttype; // use first 2*32 bits
size_t operator()(const _Kty& _Keyval) const
{ // hash _Keyval to size_t value by pseudorandomizing transform
_Inttype _Bits = *(_Inttype *)&_Keyval;
return (hash<_Inttype>()(
(_Bits & (_ULLONG_MAX >> 1)) == 0 ? 0 : _Bits));
}
};
Upvotes: 0
Reputation: 1518
stdext::hash_value isn't a hash function. It's the input to a hash function, you specialize it for your type so that it can be used as a key for the stdext hash classes. There doesn't seem to be any documentation for it however. The actual hash function is stdext::hash_compare.
But because there is no default specialization of hash_value it uses the convert-to-int method which ignores the fractional part.
There is an almost-identical bug for the standard std::tr1::hash/std::hash function up until vc10:
in vc10 std::hash gets a double specialization that hashes the bits. I guess stdext is considered obsolete now so there's no fix for it even in vc10.
Upvotes: 2
Reputation: 153899
This is, apparently, an attempt to provide a generic hash function for integers (although I don't see what the xor adds). It quite clearly won't work for most other types. Including floating point.
Providing a good hash function for a floating point value is difficult; if I were trying to make a generic hash, I'd probably start by testing for 0, NaN and Inf, and returning predefined hashes for those (or rejecting NaN completely as it is not a valid hashable value), and then simply using a standard string hash on the underlying bytes. This will at least make hashing compatible with the == operator. But the problems of precision mean that == itself may not be what is needed. Nor <, in the case of std::map, since std::map uses < to define an equality relationship, and depending on the source of the floats or doubles, such an equality relationship might not be appropriate for a hash table.
At any rate, don't expect to find a standard hash function for the floating point types.
Upvotes: 0
Reputation: 92211
That's old code - doesn't seem very good.
You should probably try std::hash instead.
Upvotes: 0
Reputation: 754525
The function is written to work with any type of data. It makes no assumption about the size and hence is inefficient for certain types. You can override this behavior for doubles to make it more efficient via a template specialization
template<>
size_t hash_value<double>(const double& key) {
return // Some awesome double hashing algorithm
}
Putting this definition above your main
method will cause the calls to stdext::hash_value(354.566)
to bind to this definition as opposed to the standard one
Upvotes: 1