Ahmad
Ahmad

Reputation: 9658

The kth smallest number in two arrays, one sorted the other unsorted

There is already an answer for two sorted arrays. However, in my question one of the arrays is unsorted.

Suppose X[1..n] and Y[1..m] where n < m. X is sorted and Y is unsorted. What is the efficient algorithm to find the kth smallest number of X U Y.

MinHeap can be used to find the kth smallest number in an unsorted array. However, here one of these arrays is sorted. I can think of:

 1. Building a `MinHeap` for `Y`
 2. i = 1, j = 1
 3. x1 = extract Min from Y
 4. x2 = X[i]; 
 5. if j == k: return min(x1, x2)
 5. if x1 < x2: j++; goto 3
 6. else: j++; i++; goto 4 

Is it efficient and correct?

Upvotes: 2

Views: 306

Answers (2)

Jim Mischel
Jim Mischel

Reputation: 133975

Make a max-heap of the smallest k items from the sorted array (X). That takes O(k) time.

For each item in the unsorted array (Y), if it's smaller than the largest item in the heap (the root), then remove the root from the heap and add the new item. Worst case for this is O(m log k).

When you're done, the kth smallest number will be at the top of the heap.

Whereas the worst case for the second part is O(m log k), average case is much better because typically a small percentage of items have to be inserted in the heap.

Upvotes: 1

btilly
btilly

Reputation: 46399

There is no help for it but you have to scan Y. This takes O(m) so you can't do better than O(m).

However quickselect has average performance O(m). Basically that algorithm is just to do a quicksort, except that you ignore all partitions that don't have your final answer in them.

Given that n < m we can simply join one array to the other and do quickselect.

Note that the average performance is good, but the worst case performance is quadratic. To fix that, if you do not make progress sufficiently quickly you can switch over to the same median of medians algorithm that gives quicksort guaranteed performance (albeit with bad constants). If you're not familiar with it, that's the one where you divide the array into groups of 5, find the median of each group, and then repeat until you are down to 1 element. Then use that element as the pivot for the whole array.

Upvotes: 1

Related Questions