Reputation: 58201
If a hash table holds N distinct items, and is not overloaded, then the hashes for the N items must have approximately lg(N) bits, otherwise too many items will get the same hash value.
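(To spell that premise out: a b-bit hash can take only 2^b distinct values, so keeping collisions rare among N items requires 2^b to be at least on the order of N, i.e. b to be roughly lg(N).)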
But a hash table lookup is usually said to take O(1) time on average.
It's not possible to generate lg(N) bits in O(1) time, so the standard results for the complexity of hash tables are wrong.
What's wrong with my reasoning?
Upvotes: 2
Views: 1250
Reputation: 59144
When we state the complexity of hash table insertion, we're not talking about some abstract concept of hash table insertion. We're talking about a specific algorithm for hash table insertion or, more accurately, a class of algorithms that differ only in irrelevant details.
A "hash table insertion" algorithm that generates variable length hashes according to the number of elements in the table, and which therefore takes more time to hash when there are more elements, would be unusual. Unless explicitly stated, that is not the kind of algorithm we're talking about when we give the complexity of "hash table insertion".
Now, a normal "hash table insertion" algorithm is specified for a RAM (random access machine), an idealized computer that behaves like a real computer as far as time complexity is concerned. That is what makes our statements of complexity useful. Unlike a real computer, though, a RAM has no fixed memory limit. It is still assumed, however, that address-sized numbers can be manipulated in constant time, which is a feature of real computers as well.
In a normal hash table implementation, the size of a hash is a bounded multiple of the computer's address size, so you are simply mistaken when you say "It's not possible to generate lg(N) bits in O(1) time". It is possible, and normal hashing algorithms take the same amount of time no matter how many elements are in the table. (They do depend on the length of the keys, though.)
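For illustration, here is a minimal sketch of a typical fixed-width hash (64-bit FNV-1a, written in C): it always produces one machine word of output, and its running time depends only on the key's length, never on how many elements the table holds.

    #include <stdint.h>

    /* 64-bit FNV-1a: the output is always one machine word, and the
       running time depends only on the key's length, never on how
       many elements are stored in the table. */
    uint64_t fnv1a_64(const char *key)
    {
        uint64_t h = 0xcbf29ce484222325ULL;     /* FNV offset basis */
        for (; *key != '\0'; ++key) {
            h ^= (uint64_t)(unsigned char)*key;
            h *= 0x100000001b3ULL;              /* FNV prime */
        }
        return h;
    }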
Upvotes: 0
Reputation: 530843
The analysis is based on the assumption that the hash function is fixed and not related to the actual number of elements stored in the table. Rather than saying that the hash function returns a lg(N)-bit value when there are N elements in the hash table, the analysis assumes a hash function that returns, say, a k-bit value, where k is independent of N. Typical values of k (such as 32 or 64) provide for a hash table far larger than anything you need in practice.
So in one sense, yes, a table holding N elements requires a hash function that returns O(lg N) bits; but in practice, a constant far larger than the anticipated maximum value of lg N is used.
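As a rough sketch of that point (assuming some fixed 64-bit hash function, here called fnv1a_64): the hash itself is always k = 64 bits, and the table merely reduces it to a bucket index, so the hashing step costs the same no matter how large N gets.

    #include <stdint.h>
    #include <stddef.h>

    uint64_t fnv1a_64(const char *key);   /* any fixed-width hash will do */

    /* The hash value is always 64 bits (k is independent of N); only the
       reduction to a bucket index depends on the table's capacity. */
    size_t bucket_index(const char *key, size_t capacity)
    {
        return (size_t)(fnv1a_64(key) % capacity);
    }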
Upvotes: 0
Reputation: 234354
What is wrong with your reasoning is the use of conflicting definitions of "time".
When one says that lookup in a hash table takes O(1) time, one usually means that it takes O(1) comparisons; that is, the number of comparisons required to find an item is bounded above by a constant. Under this idea of "time", the actual time (the thing you would measure in seconds) spent computing the hash doesn't enter into the measure at all.
Measuring time in comparisons is an approximation that, while it may not reflect reality in the same way that measuring it in seconds would, still provides useful information about the behaviour of the hash table.
This sort of thing is true for most asymptotic complexity descriptions of algorithms: people often use "time" with a very abstract meaning that isn't the informal meaning of "time", but more often than not is some variation of "number of operations" (with the kind of operation often left unstated, expected to be obvious, or clear from context).
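To make the comparison-counting view concrete, here is a rough sketch in C (my own simplified chaining layout, not any particular library's): the quantity whose expected value is bounded by a constant is the number of key comparisons, regardless of how long the hash itself took to compute.

    #include <stddef.h>
    #include <string.h>

    struct node {
        const char  *key;
        void        *value;
        struct node *next;
    };

    /* Walk one bucket's chain, counting key comparisons. With a decent
       hash function and a bounded load factor, the expected value of
       *comparisons is bounded by a constant -- that constant bound is
       the "O(1)" being claimed. */
    void *chain_lookup(struct node *bucket, const char *key, int *comparisons)
    {
        for (struct node *n = bucket; n != NULL; n = n->next) {
            ++*comparisons;
            if (strcmp(n->key, key) == 0)
                return n->value;
        }
        return NULL;
    }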
Upvotes: 11
Reputation: 549
Hashtable search is O(1). I think you are mixing up insertion (which is O(n)) and search.
Upvotes: -1