user977154
user977154

Reputation: 1095

How can i count the collisions in this hash function?

This is a prefix hashing function. i want to count the number of collisions in this method but i am not sure how to do it. It seems like it might be simple but i just cant think of a great way to do it....

int HashTable_qp::preHash(string & key, int tableSize )
{
    string pad = "AA";
    //some words in the input are less than 3 letters
    //I choose to pad the string with A because all padded characters 
    //have same ascii val, which is low, and will hopefully alter the results less
    if (key.length() < 3)
    {
        key.append(pad);
    }
    return ( key[0] + 27 * key[1] + 729 * key[2] ) % tableSize;
}

Upvotes: 0

Views: 2206

Answers (3)

wildplasser
wildplasser

Reputation: 44250

create a histogram:

 unsigned histogram[tablesize] = {0};

generate some (all) possible strings and compute their hashval, and update the histogram accordingly:

for(iter=0; iter < somevalue; iter++) {
 hashval = hashfunc( string.iterate(iter) ); // I don't know c++
 histogram[hashval] +=1;
 }

Now you have to analyze the hashtable for lumps / clusters. Rule of thumb is that for (tablesize==iter), you expect about 30 % cells with count =1, and about 30 % empty; the rest has two or more.

If you sum all the (count*(count+1))/2, and divide by the tablesize, you should expect around 1.5. A bad hashfunction gives higher values, a perfect hash would only have cells with count=1 (and thus: ratio=1) With linear probing you should of course never use tablesize=niter, but make tablesize bigger, say twice as big. You can use the same metric (number of probes / number of entries), to analyse its performance, though.

UPDATE: a great introduction on hashfunctions and their performance can be found at http://www.strchr.com/hash_functions .

Upvotes: 1

Alexander Wood
Alexander Wood

Reputation: 36

If it's an array as the underlying data structure do: int hash = preHash(&key, array.length); if(array[hash] != null) this.count++; If it's an array of linked lists do:

if(array[hash] != null && *(array[hash]) != null)
this.count++

If you only have access to the stl library I believe just testing that element is null before adding it would be enough after calling the hash function.

Upvotes: 1

Brigand
Brigand

Reputation: 86260

You can create an array of integers, each representing one hash. When you're done making the hashes loop back through the array in a nested loop. If you had the following array,

[0] -> 13
[1] -> 5
[2] -> 12
[3] -> 7
[4] -> 5

For each item i in 0..n, check items i+1..n for matches. In English that would be: check if each element equals any of the elements after it.

Upvotes: 0

Related Questions