Goal: Check if there's a duplicate number in an array of size n.
Basically, if we may use a hash table (open hashing, with linked lists), then we could iterate over the array and insert the numbers into the table with some value (it could be 1, it doesn't really matter).
While iterating, if the cell isn't empty then we have a duplicate number.
Since we know that the expected time for a read/write is O(1), the expected time for the algorithm is O(n).
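A minimal sketch of this idea in Python (my own illustration, not part of the original question; it uses the built-in set, which is a hash table under the hood, and a hypothetical function name has_duplicate):

    def has_duplicate(arr):
        # seen plays the role of the hash table; we only care about
        # membership, so the stored "value" is irrelevant, as noted above.
        seen = set()
        for x in arr:
            if x in seen:   # expected O(1) lookup
                return True
            seen.add(x)     # expected O(1) insert
        return False

    print(has_duplicate([3, 1, 4, 1, 5]))  # True
    print(has_duplicate([3, 1, 4, 5]))     # False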
Question #1: Why is the worst case O(n log n)?
Question #2: Would you do it differently than the solution suggested?
Upvotes: 0
Views: 2973
Here, I assume the author was referring to a variant of a hash table in which each "bin" holds a BST (or some other deterministic data structure). In the worst case, all n elements are repeatedly inserted into the same bin, and n insertions into one BST require O(n log n) overall.
However, hash tables are seldom implemented this way, because this worst case is very unlikely; usually each bin is a plain linked list instead, and in that case the worst case of this solution is O(n^2).
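To make the degenerate case concrete, here is a toy separate-chaining sketch (entirely hypothetical code; ChainedTable and its methods are my own names). The deliberately bad hash function sends every key to one bin, so each insert scans a chain of length up to n, and n inserts cost O(n^2):

    class ChainedTable:
        def __init__(self, nbuckets=8):
            # Each bucket holds a chain; a Python list stands in
            # for the linked list.
            self.buckets = [[] for _ in range(nbuckets)]

        def _hash(self, key):
            # Degenerate hash: every key collides, modeling the worst case.
            return 0

        def insert(self, key):
            # Returns True if key was already present (a duplicate).
            chain = self.buckets[self._hash(key)]
            if key in chain:     # O(len(chain)) scan of the chain
                return True
            chain.append(key)
            return False

With a balanced BST per bin instead of a list, the same membership test would cost O(log n) per insert, which is where the O(n log n) bound above comes from.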
The other way to approach this problem is to sort, then iterate to find duplicates (easy in a sorted array, where duplicates end up adjacent). This is O(n log n) with significantly less memory usage.
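A sketch of the sort-and-scan alternative (again my own code; has_duplicate_sorted is a hypothetical name). Sorting in place keeps the extra memory at O(1) on top of the input:

    def has_duplicate_sorted(arr):
        arr.sort()  # O(n log n); duplicates become adjacent
        # One linear pass comparing each element to its neighbor.
        return any(arr[i] == arr[i + 1] for i in range(len(arr) - 1))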
This problem is known as the element distinctness problem, and these two options (with some variants, maybe) are the standard ways to solve it.
It is known to be Omega(n log n) without using extra memory and hashing (i.e., in the comparison model).
Upvotes: 1