Reputation: 21
I need some ideas to develop a good hashing function for my assignment. I have a list of all the countries in the world (around 190) in total. The names of each country is the key for the hashing function. Is there a specific kind of hashing function anyone would recommend to store this data in a hashing function without many collisions? Also, can you perhaps give an example of how to implement it?
Upvotes: 2
Views: 234
Reputation: 5917
The simplest approach I can think of is for each country's name to compute the sum of the ASCII values in its representation and use this as the hash value:
int hash(const char *s)
{
int h = 0;
while (s && *s)
h += *s++;
return h;
}
If your hash map has size N, you store country names with map[hash(my_country) % N] = my_country
. Conceptually.
Just try this approach and see whether the resulting hash values are sufficiently uniformly distributed. Note that the quality of the distribution may also depend on N.
Upvotes: 0
Reputation: 27470
You can use generated perfect hash for that (GNU perf).
Of if the set of strings is dynamic then you can use ternary trie. For N unique strings it will give you unique number [1..N]. For your case it will be faster than with hash tables. Here is my implementation of such thing: http://code.google.com/p/tiscript/source/browse/trunk/tool/tl_ternary_tree.h
Upvotes: 2
Reputation: 249394
Use GNU gperf. For inputs like yours, it will generate C code for you which implements a perfect hash function (for the given inputs). No collisions, no worries.
Upvotes: 2