Zach

Reputation: 51

Use Scala to Find The Most Common 'uncommon' word in keys

I need to find the most common 'uncommon' word in a text file. I have a list of common words and my map of the most common words in the file.

Let's say I have

val commonWords = List("the","a","I","is")

and a map

val mostUsedWordsFromTextFile

How might I loop over the map mostUsedWordsFromTextFile until I hit a word not in list commonWords?

Upvotes: 1

Views: 345

Answers (1)

Xavier Guihot

Reputation: 61666

Assuming your input is:

// sc is assumed to be an existing SparkContext
val input = sc.parallelize(Seq(("hello", 4), ("the", 2), ("world", 6)))

then you could:

  • filter out the words that appear in commonWords
  • take the word with the highest count among the remaining ones

this way:

val commonWords = Set("the", "a", "I", "is")

val result = input
  .filter { case (word, count) => !commonWords.contains(word) } // RDD(("hello", 4), ("world", 6))
  .takeOrdered(1)(Ordering[Int].on { case (word, count) => -count }) // Array(("world", 6))
  .head // ("world", 6)
  ._1 // world

See "How to find max value in pair RDD?" for different ways of implementing maxBy on an RDD.
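
If the counts live in a plain Scala Map rather than an RDD, the same filter-then-take-the-max idea might look like the sketch below. The question does not show the contents of mostUsedWordsFromTextFile, so the Map[String, Int] shape and the sample values (named wordCounts here) are assumptions made for illustration.

```scala
// Hypothetical stand-in for mostUsedWordsFromTextFile: word -> occurrence count.
val wordCounts: Map[String, Int] = Map("hello" -> 4, "the" -> 2, "world" -> 6)

val commonWords = Set("the", "a", "I", "is")

// Keep only the entries whose word is not in commonWords.
val uncommon = wordCounts.filter { case (word, _) => !commonWords.contains(word) }

// maxBy throws on an empty collection, so guard for the case
// where every word in the map is a common word.
val mostCommonUncommon: Option[String] =
  if (uncommon.isEmpty) None else Some(uncommon.maxBy(_._2)._1)
```

With the sample values above this yields Some("world"). Returning an Option avoids an exception when nothing survives the filter; no explicit loop is needed.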

Upvotes: 1
