How to select a good hashing function (for a hashtable)

Question

I was wondering how best to decide which operations a hashing function should perform on its input, based on the probable input format of course.

Are there any rules (or rulebooks) I have yet to find?

How could I estimate the cost of such a function?

Can I somehow foresee the likelihood of collisions, knowing the charset used for inputs?

Thanks in advance for the food for thought. :)

Georgi · Accepted Answer

...

Hi Gung Foo,

Just take a look at the CRC32 vs. FNV1A_Yorikke face-off at:

http://www.sanmayce.com/Fastest_Hash/index.html#KT_torture3

How could I estimate the cost of such a function?

In short: test it with heavy and varied keys and loads. In general, a hash function used for table lookup has three major aspects to consider:

Collisions: both the overall dispersion and the maximum depth of the fullest slot;
Warm-up time, i.e. the starting cost/overhead;
Linear speed.
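
To make the first aspect concrete, here is a minimal sketch of how the dispersion and the maximum slot depth could be measured for a given key set. It is only an illustration: `fnv1a_32` is the plain 32-bit FNV-1a (not the FNV1A_Yorikke variant from the link), `slot_stats` is a hypothetical helper, and the sample keys and table size are made-up assumptions that should be replaced with your real inputs.

```python
import zlib
from collections import Counter

FNV_OFFSET_BASIS_32 = 0x811C9DC5
FNV_PRIME_32 = 0x01000193


def fnv1a_32(data: bytes) -> int:
    """Plain 32-bit FNV-1a (an assumption; not the FNV1A_Yorikke variant)."""
    h = FNV_OFFSET_BASIS_32
    for byte in data:
        h ^= byte
        h = (h * FNV_PRIME_32) & 0xFFFFFFFF  # keep the result in 32 bits
    return h


def slot_stats(hash_fn, keys, table_size):
    """Hypothetical helper: (used_slots, colliding_keys, max_depth).

    Models a chained table where slot = hash % table_size.
    colliding_keys counts keys that landed in an already occupied slot.
    """
    depth = Counter(hash_fn(key) % table_size for key in keys)
    used_slots = len(depth)
    return used_slots, len(keys) - used_slots, max(depth.values())


# Made-up sample keys and table size; substitute your probable input format.
sample_keys = [f"key{i}".encode() for i in range(10000)]
TABLE_SIZE = 16384

for name, fn in (("fnv1a_32", fnv1a_32), ("crc32", zlib.crc32)):
    used, colliding, max_depth = slot_stats(fn, sample_keys, TABLE_SIZE)
    print(f"{name}: used slots={used}, colliding keys={colliding}, "
          f"max depth={max_depth}")
```

Warm-up time and linear speed are better measured in the language the table will actually be written in, since interpreter overhead in a Python sketch like this would dominate the timings.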
