Reputation: 2572
Looking at the Wikipedia article on hash tables, it says that inserting and searching are O(1). But my concern is that my teacher told me that only the lookup is O(1) and that hashing is O(s), where s is the length of the string. Shouldn't inserting and searching be O(s) instead, since an operation works out to hashing(s) + lookup(s) = O(hashing(s) + lookup(s)) = O(s)?
Could anyone explain to me what the correct way of writing the time complexity for hash tables in big O notation is, and why? Assume perfect hashing, so no collisions occur.
Upvotes: 0
Views: 1224
Reputation: 59174
Hash tables are used for more than just strings. The O(1) complexities for insert and lookup apply to hash tables in general and count only the operations the table is known to perform: hash computations and key comparisons.
Hashing and comparison are each counted as O(1), because some work must always be done for them, even if you're just storing integers, but the general analysis doesn't know what that work costs.
If you use a hash table with a data type (like strings) that multiplies the cost of those operations, then it multiplies the complexity as well.
It is actually very important to consider this when measuring the complexity of a concrete algorithm that uses hash tables. Many of the string-based algorithms on this site, for example, are given complexities based on the assumption that the length of input strings is bounded by some constant. Thankfully that is usually the case.
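To make the string case concrete, here is a minimal Python sketch (my own illustration, not part of the original answer) of a polynomial string hash. The loop visits every character, so computing the hash of a key of length s costs O(s), even though the table's analysis counts it as a single "hash" operation:

    def poly_hash(key: str, base: int = 257, mod: int = 2**61 - 1) -> int:
        """Polynomial rolling hash: reads every character, so it is O(len(key))."""
        h = 0
        for ch in key:
            h = (h * base + ord(ch)) % mod
        return h

    # Inserting or looking up a string key therefore costs
    # O(len(key)) for hashing plus O(1) expected bucket work.
    NUM_BUCKETS = 16
    bucket_index = poly_hash("example key") % NUM_BUCKETS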
Upvotes: 2
Reputation: 58241
This question is very similar to a question I asked: Is a lookup in a hash table O(1)?
The accepted answer was that for hash tables, "time" is measured in comparisons, not in elementary operations. Here's the full answer, quoted:
What is wrong with your reasoning is the use of conflicting definitions of "time".
When one says that lookup in a hash table takes O(1) time, one usually means that it takes O(1) comparisons, that is, the number of comparisons required to find an item is bounded above by a constant. Under this idea of "time", the actual time (as in the thing you would measure in seconds) used to compute the hash causes no variation.
Measuring time in comparisons is an approximation that, while it may not reflect reality in the same way that measuring it in seconds would, still provides useful information about the behaviour of the hash table.
This sort of thing is true for most asymptotic complexity descriptions of algorithms: people often use "time" with a very abstract meaning that isn't the informal meaning of "time", but more often than not is some variation of "number of operations" (with the kind of operation often left unstated, expected to be obvious, or clear from context).
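To illustrate the distinction (this sketch is my own, not from the quoted answer), here is a small Python example of a lookup in a chained hash table that counts whole-key comparisons separately from character-level work. The comparison count is what stays O(1) in expectation; the character-level work from hashing and equality checks grows with the key length:

    # Tiny chained hash table: 8 buckets, each a list of (key, value) pairs.
    NUM_BUCKETS = 8
    table = [[] for _ in range(NUM_BUCKETS)]

    def insert(key, value):
        table[hash(key) % NUM_BUCKETS].append((key, value))

    def lookup(key):
        """Return (value, comparisons, char_ops).

        comparisons = whole-key equality checks (the quantity that is O(1) expected);
        char_ops    = approximate character-level work from hashing and comparing,
                      which grows with len(key).
        """
        comparisons = 0
        char_ops = len(key)                      # hashing reads every character once
        for stored_key, value in table[hash(key) % NUM_BUCKETS]:
            comparisons += 1
            char_ops += min(len(stored_key), len(key))  # equality check may scan the key
            if stored_key == key:
                return value, comparisons, char_ops
        return None, comparisons, char_ops

    insert("alpha", 1)
    insert("a much longer key string", 2)
    # Few comparisons either way, but char_ops scales with the key length.
    print(lookup("a much longer key string"))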
Upvotes: 0