nurabha

Reputation: 1212

What is the fastest algorithm to find an element with the highest frequency in an array?

I have two input arrays X and Y. I want to return that element of array X which occurs with highest frequency in array Y.

The naive way of doing this requires that for each element x of array X, I linearly search array Y for its number of occurrences and then return that element x which has highest frequency. Here is the pseudo algorithm:

 max_frequency = 0
 max_x = -1             // -1 indicates no element found
 For each x in X
     frequency = 0
     For each y in Y
         if y == x
             frequency++
     End For
     If frequency > max_frequency
         max_frequency = frequency
         max_x = x
     End If
 End For
 return max_x

As there are two nested loops, the time complexity of this algorithm would be O(n^2). Can I do this in O(n log n) or faster?

Upvotes: 6

Views: 3745

Answers (9)

Gopal k .Dwivedi

Reputation: 79

If you want to find the element with the highest frequency in an array, you can write code like this. The example is in JavaScript, but you can use the same approach in any other language as well.

    // Declare an array
    let array = [1, 2, 3, 4, 5, 3, 3, 2, 2, 3, 3, 3, 3, 1, 1, 1, 4];

    // Create an empty object and count how often each element occurs
    let obj = {};
    for (let index = 0; index < array.length; index++) {
        const element = array[index];
        obj[element] = obj[element] ? obj[element] + 1 : 1;
    }

    // Object with the frequency of each element:
    // { 1: 4, 2: 3, 3: 7, 4: 2, 5: 1 }
    console.log("object with frequency of each element is =", obj);

    // Find the key with the largest count
    let high = Object.keys(obj)[0];
    for (let key in obj) {
        if (obj[high] < obj[key]) {
            high = key;
        }
    }
    console.log("Highest Frequency element is =", high); // 3

Upvotes: 0

rnbguy

Reputation: 1399

Sort X and Y. Then do a merge-style pass over the two sorted arrays (as in the merge step of merge sort), counting the occurrences in Y each time the current element of Y matches the current element of X.

So the complexity is O(n log n) + O(m log m) + O(m + n) = O(k log k), where n and m are the lengths of X and Y, and k = max(m, n).
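
A minimal sketch of this sort-and-merge counting idea in Python (the function and variable names are mine, not from the answer; -1 stands for "no element found", as in the question):

    def most_frequent_in_y(X, Y):
        """Return the element of X that occurs most often in Y, via a merge-style pass."""
        xs = sorted(set(X))          # O(n log n); duplicates in X don't change the answer
        ys = sorted(Y)               # O(m log m)
        best, best_count = -1, 0
        i = j = 0
        while i < len(xs) and j < len(ys):     # O(m + n) merge-style scan
            if ys[j] < xs[i]:
                j += 1
            elif ys[j] > xs[i]:
                i += 1
            else:
                # Count the run of values in ys equal to xs[i]
                count = 0
                while j < len(ys) and ys[j] == xs[i]:
                    count += 1
                    j += 1
                if count > best_count:
                    best, best_count = xs[i], count
                i += 1
        return best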

Upvotes: 1

Leonid Volnitsky

Reputation: 9144

1st step: Sort both X and Y. Assuming their corresponding lengths are m and n, complexity of this step will be O(n log n) + O(m log m).

2nd step: count each Xi in Y and track maximum count so far. Search of Xi in sorted Y is O(log n). Total 2nd step complexity is: O(m log n)

Total complexity: O(n log n) + O(m log m) + O(m log n), or simplified: O(max(n, m) log max(n, m))
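
A rough sketch of this two-step approach in Python, using bisect for the binary searches (names are illustrative, not from the answer):

    from bisect import bisect_left, bisect_right

    def most_frequent_in_y(X, Y):
        """For each x in X, count its occurrences in the sorted Y with two binary searches."""
        Y_sorted = sorted(Y)                 # O(n log n)
        best, best_count = -1, 0             # -1 means "no element found", as in the question
        for x in X:                          # m iterations
            # Occurrences of x in Y_sorted, in O(log n) per element
            count = bisect_right(Y_sorted, x) - bisect_left(Y_sorted, x)
            if count > best_count:
                best, best_count = x, count
        return best

Strictly, only Y has to be sorted for the binary searches; sorting X as well does not hurt the stated bound.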

Upvotes: 2

angelatlarge

Reputation: 4150

Alternatively, if you can use additional data structures, you can walk the array Y, updating each number's frequency in a hash table. This takes O(N(Y)) time. Then walk X to find which of its elements has the highest frequency. This takes O(N(X)) time. Overall: linear time, and since any implementation has to look at each element of both X and Y at least once (EDIT: this is not strictly true in all cases/all implementations, as jwpat7 points out, though it is true in the worst case), you can't do it any faster than that.
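
A small sketch of this linear-time idea in Python, with a dict standing in for the hash table (names are mine):

    def most_frequent_in_y(X, Y):
        """Hash-count the values of Y, then scan X for the element with the largest count."""
        counts = {}
        for y in Y:                          # O(N(Y))
            counts[y] = counts.get(y, 0) + 1
        best, best_count = -1, 0             # -1 means no element of X occurs in Y
        for x in X:                          # O(N(X))
            c = counts.get(x, 0)
            if c > best_count:
                best, best_count = x, c
        return best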

Upvotes: 3

David

Reputation: 583

The time complexities of common algorithms are listed below:

Algorithm     |   Best    |   Worst   |  Average
--------------+-----------+-----------+---------- 
MergeSort     | O(n lg n) | O(n lg n) | O(n lg n)  
InsertionSort |   O(n)    |  O(n^2)   |  O(n^2)  
QuickSort     | O(n lg n) |  O(n^2)   | O(n lg n)  
HeapSort      | O(n lg n) | O(n lg n) | O(n lg n)  
BinarySearch  |   O(1)    |  O(lg n)  |  O(lg n)  

In general, when traversing a list to check a certain criterion, you really can't do better than linear time. If you are required to sort the array, I would say stick with merge sort (very dependable) to find the element with the highest frequency.

Note: This is under the assumption that you want to use a sorting algorithm. Otherwise, if you are allowed to use any data structure, I would go with a hashmap/hashtable type structure with constant lookup time. That way, you just match keys and update the frequency key-value pair. Hope this helps.

Upvotes: 2

sgmorrison

Reputation: 986

Your suggested approach will be O(n^2) if both lists are length n. What's more likely is that the lists can be different lengths, so the time complexity could be expressed as O(mn).

You can separate your problem into two phases:

1. Order the unique elements from Y by their frequency
2. Find the first item from this list that exists in X

As this sounds like a homework question I'll let you think about how fast you can make these individual steps. The sum of these costs will give you the overall cost of the algorithm. There are many approaches that will be cheaper than the product of the two list lengths that you currently have.

Upvotes: 1

cHao

Reputation: 86575

Use a hash table mapping keys to counts. For each element in the array, do something like counts[element] = counts[element] + 1, or your language's equivalent.

At the end, loop through the mappings in the hash table and find the max.
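
A brief sketch of this counting idea in Python, applied to a single array (a dict is the hash table; defaultdict just saves the missing-key check):

    from collections import defaultdict

    def highest_frequency_element(arr):
        """Count every element in a hash table, then return the key with the largest count."""
        counts = defaultdict(int)
        for element in arr:
            counts[element] += 1                 # counts[element] = counts[element] + 1
        return max(counts, key=counts.get)       # loop over the mappings, keep the max

For example, highest_frequency_element([1, 2, 2, 3, 2]) returns 2.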

Upvotes: 7

apomene

Reputation: 14389

Merge sort, based on the divide-and-conquer concept, gives you O(n log n) complexity.

Upvotes: 1

rowdog

Reputation: 236

You could do a quicksort and then traverse the sorted array with variables that track how many copies of a number appear in a row and what that number is. That should give you O(n log n).
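
A quick sketch of the sort-then-scan idea in Python (sorted() stands in for the quicksort; this is the single-array version of the problem, and it assumes the array is non-empty):

    def highest_frequency_element(arr):
        """Sort, then scan for the longest run of equal values."""
        arr = sorted(arr)                        # O(n log n)
        best, best_run = arr[0], 0               # assumes arr is non-empty
        current, run = arr[0], 0
        for value in arr:                        # O(n)
            if value == current:
                run += 1
            else:
                current, run = value, 1          # a new run starts
            if run > best_run:
                best, best_run = current, run
        return best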

Upvotes: 0
