user4567366
user4567366

Reputation:

Count occurrences in an array in O(n log n) time

Given an unsorted array, A[], containing n integers, how would one create an algorithm to return the element that occurred most often?

I figure you'd need a way to count the number of times an element occurs. I can only figure out an O(n2) implementation. I know I need to use a sorting algorithm first, but if I were to use merge sort, that's already the total runtime of O(n logn)). I've only sorted the data and I can't look at the elements without further increasing the runtime.

Upvotes: 3

Views: 4250

Answers (3)

xoxocoder
xoxocoder

Reputation: 314

just take a dictionary store every number in it as key, and value as its count initially 1 if number repeated again increase that count otherwise input it in dict.

traverse through each key value in dictionary whose value is greater its key will be largest among all

solved in n+n =2n ~> n

Upvotes: 1

amit
amit

Reputation: 178471

Sort the array, then, all repeating elements are next to each other.
Simply iterate the array, while holding a maxSeen and currSeen counters, if current element equals last element, increase currSeen, otherwise - reset currSeen, and replace maxSeen if the currSeen is bigger than it.

Sorting is O(nlogn), and iterating once is O(n), so total of O(nlogn + n) = O(nlogn)

It's homework, so following this guidelines to create a code should be your task. Good Luck!


As a side note, since this is at least as hard as Element Distinctness Problem, and it cannot be done better than O(nlogn) using comparisons based algorithms.


Side note two, it can be done in O(n) time + space using hash-tables.
Create a histogram of the data, which is a hash-map, inwhich a key is an element, and the value is number of occurances. Then, the problem decays to finding maximal value in this hash table.
Building the table is O(n) average case, and finding maximum is O(n).

This however uses O(n) extra memory, and might deteriorate to O(n^2) time under worst case behavior.

Upvotes: 11

Michael Laszlo
Michael Laszlo

Reputation: 12239

You surmise correctly that it's a good idea to start by sorting the array. This costs O(n log n), as you point out. Then you can scan the array and compute the frequency of each value one by one, remembering the most frequent value as you go along. This costs O(n). The total cost is O(n log n + n), which is in O(n log n).

To see why, consider O(2n log n). This is surely in O(n log n) because 2n log n is within a constant factor of n log n. And n log n + n is bounded above by 2n log n, so it is also in O(n log n).

Upvotes: 3

Related Questions