Reputation: 173
I am working on an algorithm to find the first duplicate element in an array of numbers. I understand to do this i should sort my array (using merge sort in O(nlogn)) and then iterate through the array while performing a binary search on the elements (O(logn)). So the total runtime of my algorithm is O(nlogn)+O(logn)= O(nlogn). My question is what am I missing to make it O(nlog²n). I do not understand why the log would be squared.
The problem statement:
You receive a list of integers, one at a time, a1,a2,... and I want you to find the first time an ai comes up that has occurred previously, i.e. find the smallest i such that all the a1,a2,...,ai-1 are distinct but ai=aj for some
1≤j<i
. Describe an O(i log2 i) time algorithm to find the smallest i. ( Give an algorithm in recursive form and show its running time using the Master Theorem. )
Upvotes: 1
Views: 588
Reputation: 26117
Finding a repeated element should only take O(n log n) time.
Once you have sorted your array, your repeated elements will all be next to each other; you shouldn't need to perform a binary search.
Edit in response to the edited question:
If, instead of a batch task, you need an online streaming algorithm, then you can use a balanced search tree of elements already encountered. This should provide O(i log i) performance over the first i elements.
Upvotes: 2
Reputation: 178531
Your algorithm is incorrect, as it runs slower than the requirements of the actual question.
The question specifically asks to run in O(i logi)
time, where i
being the first dupe. If you do your algorirthm when stream of numbers is done, you are too slow (the stream could also be infinite!) If you repeat your algorithm for each new number, it will run too slow (1log1 + 2log2 + 3log3 + ... + ilogi
which is not in O(ilogi)
).
You could do so by maintaining a set (based on a self balancing BST), and when a new element comes:
check if it is in the set
if it is:
abort - found the first dupe
otherwise:
add it.
Also note, the requirement (as you gave it) is O(i log_2i)
, and not O(i*log^2(i))
Upvotes: 3
Reputation: 4410
Use a BST as follows:
- Initialize a Binary Search Tree
- For each new element that comes along, insert it in the tree and update the tree
- Whenever you find an element that was already present in the BST, => you have found the smallest i index
The complexity for updating the tree is O(kLogk) when inserting the k'th element.
Upvotes: 3