Saket

Reputation: 46147

Find word with maximum number of occurrences

What is the most efficient algorithm for finding the word that occurs the greatest number of times in a document?

Upvotes: 2

Views: 2367

Answers (2)

amit

Reputation: 178511

Finding the word that occurs most often in a document can be done in O(n) time with a simple hash-based histogram:

histogram <- new map<String,int>
for each word in document: 
   if word in histogram:
      histogram[word] <- histogram[word] + 1
   else:
      histogram[word] <- 1
max <- 0
maxWord <- ""
for each word in histogram:
   if histogram[word] > max:
      max <- histogram[word]
      maxWord <- word
return maxWord

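This solution runs in O(n) expected time, where n is the number of words in the document, assuming hash map insertions and lookups take O(1) on average. Since any algorithm must read every word at least once, the problem is Omega(n), so this solution is asymptotically optimal.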
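For reference, here is a minimal Python sketch of the histogram approach above. The function name `most_frequent_word` and the tokenization rule (lowercased runs of letters, digits, and apostrophes) are my own assumptions; the pseudocode does not say how the document is split into words, so adjust that part to your needs.

```python
import re
from collections import Counter


def most_frequent_word(text: str) -> str:
    """Return the most frequent word in text, or "" if it has no words."""
    # Assumption: a "word" is a run of letters, digits, or apostrophes,
    # compared case-insensitively. Swap in your own tokenizer if needed.
    words = re.findall(r"[a-z0-9']+", text.lower())

    # First pass: build the histogram (word -> number of occurrences).
    histogram = Counter(words)

    # Second pass: scan the histogram for the largest count.
    max_count = 0
    max_word = ""
    for word, count in histogram.items():
        if count > max_count:
            max_count = count
            max_word = word
    return max_word
```

Because the comparison is strict, ties go to the word that appeared first in the document. If you do not need the explicit loop, `Counter.most_common(1)` should give the same kind of answer in one call.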

Upvotes: 2

NPE

Reputation: 500873

  1. Scan the document once, keeping a count of how many times you have seen every unique word (perhaps using a hashtable or a tree to do this).
  2. While performing step 1, keep track of the word that has the highest count of all words seen so far.
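The two steps above could be sketched in Python roughly as follows. The function name `most_frequent_streaming` is hypothetical, and I am assuming the document has already been split into an iterable of words. Tracking the maximum during the scan works because counts only ever increase, so the leader can only change to the word whose count was just incremented.

```python
from typing import Dict, Iterable, Optional, Tuple


def most_frequent_streaming(words: Iterable[str]) -> Optional[Tuple[str, int]]:
    """Return (word, count) for the most frequent word, or None if empty."""
    counts: Dict[str, int] = {}      # step 1: per-word counts (hash table)
    best_word: Optional[str] = None  # step 2: leader seen so far
    best_count = 0

    for word in words:
        new_count = counts.get(word, 0) + 1
        counts[word] = new_count
        # Only the word just updated can overtake the current leader.
        if new_count > best_count:
            best_word, best_count = word, new_count

    if best_word is None:
        return None
    return best_word, best_count
```

This uses a `dict`, so the scan takes O(n) expected time. Replacing it with a balanced tree, as the answer also allows, would make each update O(log k) for k distinct words, giving O(n log k) overall.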

Upvotes: 2
