Reputation: 837
As a part of the project I am currently working on, I need to using several relatively short strings (e.g. "ABCD1234") as keys for a custom container. The problem is, the objects in this container are of type whose "primary key", so to speak, is numeric. So I need to take the unique strings given to me, translate them into something numeric, and make sure I preserve the uniqueness.
I've been trying to use boost::hash
, and while I think it's going to work, I am annoyed by how big the hash value ends up being, especially considering that I know I am going to start of with short strings.
Is there another library, native or third-party, I could maybe use? This is obviously a convenience thing, so I am not too worried about it, but figured I might as well ask.
Upvotes: 3
Views: 542
Reputation: 837
Turns out neither solution is viable for me. I'll just have to work with size_t
s. Thanks though.
Upvotes: 0
Reputation: 392833
You can use proper cryptographically strong hashes (digests).
These have the nice property that they can be truncated without removing their random distribution properties (this is NOT the case with general purpose hash values and also not of UUIDs).
While say a raw SHA-1 is much longer (160bit) and also not nearly as fast, you can truncate it too much smaller values as long as you can provide a usefully small collision probability.
This is the approach that Darcs, Mercurial, Git etc. take with their commit identifiers.
Note for speed, SHA-2 is faster and results in a 512 bits digest, so there's a special approach known as SHA-512/64 eg. to truncate SHA-2's 512 bits into a 64 digest. Also, you could look at faster hashes such as BLAKE or BLAKE2.
If you're looking for a perfect hash for known strings, here's an old answer of mine that gives a complete example of this:
Upvotes: 1
Reputation: 1768
you could write your own that returns a short, but that's going to be prone to collisions.
Here is one I adapted to return a short/16 bits. Might need some tweaking.
unsigned short hash( std::string const& s ) {
short results = 3;
for ( auto current = s.begin(); current != s.end(); ++ current ) {
unsigned char c = static_cast<unsigned char>( *current );
results = results + ((results) << 5) + *(c + i) + ((*(c + i)) << 7);
i++;
}
return ((results) ^ (results >> 16)) & 0xffff;
}
Also, if you know what your keys are ahead of time and there aren't a lot of them, you could look into a perfect hash
Upvotes: 1