Fastest algorithm to find 2nd biggest element in Max Heap (with duplicates)

Question

If you have a max heap containing n integers, what would be the most efficient way to find the second largest element? The heap can contain duplicates, so the heap with n-1 max values and 1 other value would return the other value

For example, a heap containing the numbers:

4,4,4,4,4,4,4,3,4

would return the value 3.

Is there a way of doing this in faster than O(n) runtime?

trincot · Accepted Answer

There is no way to do this with better time complexity than O(n). With the example data you give (4,4,4,4,4,4,4,3,4) the heap could for example be one of these two:

             4                      4
           /   \                  /   \
         4       4              4       4
        / \     / \            / \     / \
       4   4   4   4          4   4   3   4
      / \                    / \
     4   3                  4   4

... the 3 could be in any leaf node, as this depends on the order of insertion. When you start a traversal from the root, there is no way to know whether the 3 is at the left or the right.

If you are open to use a slightly alternative data structure, then it can be done in O(1):

Store unique values in the heap. Use a hashmap to store information about the value you add. In the simple case this "information" could be an occurrence counter. So the next time you want to insert the same value in the structure, you would detect it was already in the hashmap and only increment the corresponding occurrence counter and not touch the heap.

For the above example the data structure would be as follows:

    heap              hashmap
                    key | value (=frequency)
       4           -----+-------------------
      /              4  |  8
     3               3  |  1

In case your data elements are complex structures combining a key with some related data (properties), then you would still only store the key in the heap without duplicates. The hashmap would then not give a counter for each key, but an array of actual data elements that share that same key.

So to be clear, the implementation of operations like insertion, deletion and look up would have to be customised. Here is some pseudo code assuming the existence of two variables heap and hashmap which have the corresponding behaviour:

function insert(element):
    key = element.key
    if key not in hashmap:
        hashmap.set(key, new Array)
        heap.insert(key)
    arr = hashmap.get(key) # get a reference to the array
    arr.append(element) # add element to array, affecting the hashmap-stored value

function extract(): # remove max element
    if heap.size() == 0:
        return # No more data
    key = heap.peek() # look at root value
    arr = hashmap.get(key) # get a reference to the array
    element = arr.pop() # extract from array, affecting the hashmap-stored value
    if arr.length() == 0:
        heap.extract()
        hashmap.delete(key)
    return element

function peek(): # return element with max value
    if heap.size() == 0:
        return # No more data
    key = heap.peek() # look at root value
    arr = hashmap.get(key)
    element = arr[-1] # last element in array
    return element

You can get the greatest value that is less than the max value as follows:

key = max(heap.root.children())

... and then depending on what you expect as return value, you could also fetch a corresponding data element from the hashmap, or even all of them (when there are duplicates).

Fastest algorithm to find 2nd biggest element in Max Heap (with duplicates)

Answers (1)

Related Questions