Chang
Chang

Reputation: 4133

Hash function - two different meanings?

when we say hash function, I find it means converting a sequence bytes of keys to 32bit or 64bit unsigned integer in most articles,for example see this

However, when you implement hash_table, it looks like that hash function means converting a very large integer to smaller internal array index, and in this domain, the meaning of "hash function" mentioned above is changed to hash value of keys.

  1. Is my understanding right?
  2. Can someone provide some insights or links or papers on the converting large integer to smaller internal array index?

Thanks

Upvotes: 1

Views: 173

Answers (3)

comingstorm
comingstorm

Reputation: 26087

The term "hashing" generally covers both of the above meanings; as other answers point out, the operations are similar. Also, the two processes are generally used in tandem -- one is not really useful without the other.

When seeking out or designing a hashing system, the fiddly part is generating a well-distributed 32/64 bit integer (the actual "hash function"). Once you have a good initial hash value, the exact way you use its output is not critical, as long as the result is fairly evenly distributed across your final indices. (This kind of functional division allows you to update the algorithm/datastructure independently of the hash function.)

The obvious way to generate a final index (suitable for a fixed-size hash table) is to take the hash value modulo the number of indices. However, the way the hash value is used depends on the application (e.g., a dynamic-size hash table will probably do something different from a fixed-size table).

Upvotes: 1

Patrick87
Patrick87

Reputation: 28292

My understanding of "hash function" is this: any function from a set A to a set {0, 1, 2, ..., n}, where n is a non-negative natural number. Nothing else is inherently part of what it means to be a "hash function". Both of your examples - and many other examples - consistute "hash functions", since they map things to a subset of non-negative integers. The way in which a "hash function" is applied to a problem is also not part of the definition.

I don't even really think the domain has to be bigger than the codomain, but I might be wrong. I don't think the codomain can be infinite, but I may be wrong.

Upvotes: 1

Poindexter
Poindexter

Reputation: 2425

A hash function is simply a mapping from a large data set to a smaller data set. In the case of a hash table, that smaller set of data (often integers as you point out) is used as the lookup keys for the buckets.

Per your example article, all of the integers that all those hash functions output would then be used as the look up indices for the hash table.

Upvotes: 0

Related Questions