Reputation: 41
The question is from the The Algorithm Design Manual. I have been working on it but haven't found a method to arrive at the right answer.
Question: We seek to sort a sequence S of n integers with many duplications, such that the number of distinct integers in S is O(log n). Give an O(n log log n) worst-case time algorithm to sort such sequences.
I think maybe can first pick all these distinct elements and form an array of logn length and then record the frequency and sort it. However the my first step seems to blow up running time too much...Is any superior selection method or is my method totally wrong? Thanks
Upvotes: 3
Views: 3649
Reputation: 52546
Create an array containing pairs of (unique numbers, count). The array is initially empty and kept sorted.
For each number in your original array, look the number up in the sorted array using binary search. Since the array has size O (log N), the binary search each time takes O (log log N), you do that N times, total O (N log log N). When found, you increase the count.
When not found, you insert the new number with a count of 1. This operation only happens O (log N) times, and is trivially done in O (log N) steps, for a total of O (log^2 N), which is much smaller than O (N log log N).
When you are done, fill the original array with the required numbers. That takes O (N).
There's really no need to create a balanced sorted tree to make the insertions faster, because the set of unique numbers is so small.
If the set of integers is all contained in a range X ≤ number ≤ Y, then the problem can be solved in O (max (N, Y - X + 1)) using an array of X - Y + 1 counters and not even bothering to find unique numbers. The technique is reportedly used to great effect in Iain Banks' book "Player of Games".
Upvotes: 4
Reputation: 449
Use a balanced binary tree to calculate the number of occurrences of each number. Since there are only log N distinct numbers, the size of the tree is log N, and thus all operations is performed in log log N. (this is exactly how map<> is implemented is C++)
Then, just iterate the nodes of the tree in a pre-order traversal, and print each integer the required number of times in this order.
Upvotes: 6