Reputation: 51
I need to find the most common 'uncommon' word in a text file. I have a list of common words and a map of the most common words in the file with their counts.
Let's say I have
val commonWords = List("the","a","I","is")
and map
val mostUsedWordsFromTextFile
How might I loop over the map mostUsedWordsFromTextFile until I hit a word not in list commonWords?
Upvotes: 1
Views: 345
Reputation: 61666
Assuming your input is:
val input = sc.parallelize(Seq(("hello", 4), ("the", 2), ("world", 6))) // assumes sc is an existing SparkContext
then you could filter out the common words and take the word with the highest count this way:
val commonWords = Set("the", "a", "I", "is")
val result = input
.filter { case (word, count) => !commonWords.contains(word) } // RDD(("hello", 4), ("world", 6))
.takeOrdered(1)(Ordering[Int].on { case (word, count) => -count }) // Array(("world", 6))
.head // ("world", 6)
._1 // world
See "How to find max value in pair RDD?" for different ways of implementing maxBy on an RDD.
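If the counts live in a plain Scala collection rather than an RDD, the same idea can be sketched without Spark. This is only a minimal sketch: the question does not show the type of mostUsedWordsFromTextFile, so a Map[String, Int] is assumed here, and the object and helper names (MostCommonUncommonWord, mostCommonUncommon, firstUncommon) are made up for illustration.

```scala
object MostCommonUncommonWord {
  // Hypothetical helper: highest-count word that is not in `commonWords`, if any.
  def mostCommonUncommon(counts: Map[String, Int], commonWords: Set[String]): Option[String] = {
    val uncommon = counts.filter { case (word, _) => !commonWords.contains(word) }
    if (uncommon.isEmpty) None
    else Some(uncommon.maxBy { case (_, count) => count }._1)
  }

  // Hypothetical helper: if the pairs are already sorted by descending count,
  // `find` stops at the first word that is not common (the "loop until" from the question).
  def firstUncommon(sortedCounts: Seq[(String, Int)], commonWords: Set[String]): Option[String] =
    sortedCounts.find { case (word, _) => !commonWords.contains(word) }.map(_._1)

  def main(args: Array[String]): Unit = {
    val commonWords = Set("the", "a", "I", "is")
    // Sample data; the shape of the real map is an assumption.
    val mostUsedWordsFromTextFile = Map("hello" -> 4, "the" -> 2, "world" -> 6)

    println(mostCommonUncommon(mostUsedWordsFromTextFile, commonWords)) // Some(world)

    val sorted = mostUsedWordsFromTextFile.toSeq.sortBy { case (_, count) => -count }
    println(firstUncommon(sorted, commonWords)) // Some(world)
  }
}
```

Both helpers return an Option, so a file containing only common words yields None instead of throwing on an empty collection.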
Upvotes: 1