Chaining in hash tables

When we have a hash table with chaining:

I am just wondering if maintaining the list at each key in order affects the running time for searching, inserting and deleting in the hash table?

Upvotes: 3

Answers (3)

wildplasser

Reputation: 44250

In theory: yes, since in the average case you will only have to walk half the chain to find if an item is on the chain or not.

In practice, there is probably not much difference, since the chains are typically very short, and the increased code complexity would also cost some cycles, mainly in the "insert" case.

BTW: in most cases the number of slots is considerably smaller than the "keyspace" of the hash values. If you can afford the space, storing the hash values in the chain nodes will save recomputing the hash value on every hop, and will avoid most of the final compares. This of course is a space<-->time tradeoff. As in:

struct hashnode **this;
for (this=& table[slot] ; *this; this = &(*this)->link) {
    if ((*this)->hash != the_hash) continue;
    if (compare ((*this)->payload , the_value)) continue;
    break;
 }
 /* at this point "this" points to the pointer that points to the wanted element,
    or to the NULL-pointer where it should be inserted.

    For the sorted-list example, you should instead break out of the loop
    if the compare function returns > 0, and handle that special case here.

 */

Upvotes: 2

David Souther

Reputation: 8184

Hypothetically, you've chosen your hash algorithm and map size to mitigate the number of collisions you will get in the first place. At that point, you should have a very small list (ideally one or two elements) at any position, so the extra effort of maintaining a sorted structure in the chain is most certainly more than just iterating the small number of items in that bucket.

Upvotes: 1

Mahmoud Al-Qudsi

Reputation: 29579

Yes, of course. The usually-cited O(1) for a hashtable is assuming perfect hashing - where no two items that are not the same resolve to the same hash.

In practice, that won't be the case. You'll always have (for a big enough data set) collisions. And collisions will mean more work at lookup time, regardless of whether you're using chaining or some other collision-resolution technique.

That's why it's very, very important to select a good hash function that is well designed/written, and a good match for the data you'll be using as the key for your hash table. Different types of data will hash better with different hash functions, in practice.

Upvotes: 0

Chaining in hash tables

Answers (3)

Related Questions