Gung Foo
Gung Foo

Reputation: 13558

How to select a good hashing function (for a hashtable)

I was wondering how one would best approach the task of deciding upon the operations a hashing function should perform on it's input, based on the probable input format of course.

Are there any rule(book)s i have yet to find?

How could i estimate the cost of such a function?

Can i somehow foresee the likelihood of collisions knowing the charset used for inputs?

Thanks for your food for my thought in advance. :)

Upvotes: 1

Views: 718

Answers (2)

Georgi
Georgi

Reputation: 148

...

Hi Gung Foo,

just take a look at CRC32 vs FNV1A_Yorikke face-off at:

http://www.sanmayce.com/Fastest_Hash/index.html#KT_torture3

How could i estimate the cost of such a function?

In short: heavy & versatile keys/loads. Generally a hash (table-look-up) function has three major aspects to consider:

  • Collisions both dispersion and MAX depthness of the fattest slot;

  • Warm-up time i.e. starting cost/overhead;

  • Linear speed.

Upvotes: 1

kolossus
kolossus

Reputation: 20691

The general rule of thumb with hashcode generation is that the resulting value be as unique as possible. Two things that are desirable in a hashcode/hashfunction

  1. The hashcode is desired to be as unique (and as small) as possible. That being said,(in an ideal world) using a data member whose datatype is small on footprint and that can be guaranteed to be unique for any instance of the value is a fast an efficient way to arrive at a hashcode. This sometimes is however, not a safe practice.
  2. The hash function should be perfect i.e. should be able to generate a unique value, all values being generated within a small range.

Upvotes: 0

Related Questions