Surender Raja

Reputation: 3599

Not able to complete the word count program in Spark using Scala

I am working through some basic programs in Scala.

I am trying to write a word count program in Scala:

scala> val myWords = "HI HOW HI HOW ARE"
myWords: String = HI HOW HI HOW ARE

scala> val mySplit = myWords.split(" ")
mySplit: Array[String] = Array(HI, HOW, HI, HOW, ARE)

scala> val myMap = mySplit.map(x => (x,1))
myMap: Array[(String, Int)] = Array((HI,1), (HOW,1), (HI,1), (HOW,1), (ARE,1))

scala> val myCount = myMap.reduceByKey((a,b) => a+b)
<console>:16: error: value reduceByKey is not a member of Array[(String, Int)]
   val myCount = myMap.reduceByKey((a,b) => a+b)

I am not sure what this error means.

So I used tab completion to see which methods I can invoke on myMap:

scala> val myCount = myMap.
apply          asInstanceOf   clone          isInstanceOf   length            toString       update

Could someone explain where I went wrong in my code?

Upvotes: 1

Views: 738

Answers (2)

Simon

Reputation: 6363

I think your code comes from an Apache Spark example. To do a word count in plain Scala, you can use groupBy or one of the fold* methods available on Scala collections such as Seq.
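As a minimal sketch of the groupBy approach in plain Scala (no Spark needed): the object name WordCount and the helper count are hypothetical names chosen for illustration, and the input is assumed to be split on single spaces as in the question.

```scala
// Hypothetical helper object, for illustration only.
object WordCount {
  // Group identical words together, then replace each group by its size.
  def count(text: String): Map[String, Int] =
    text.split(" ")
      .groupBy(identity)
      .map { case (word, occurrences) => (word, occurrences.length) }

  def main(args: Array[String]): Unit =
    println(count("HI HOW HI HOW ARE"))
}
```

groupBy(identity) yields a Map from each distinct word to an array of its occurrences, so the final map step only has to take each array's length.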

Edit: I see from your comment that you are indeed using Spark. In that case, you need to turn your array into an RDD, because reduceByKey is defined on RDDs of key-value pairs, not on Array. Use sc.parallelize to turn a Seq into an RDD. Then your code will work.
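A sketch of what that could look like, assuming it is run inside spark-shell, where sc (the SparkContext) is already defined. The name myRdd is introduced here for illustration; the rest mirrors the question's code.

```scala
// Assumes a spark-shell session: `sc` is the predefined SparkContext.
val myWords = "HI HOW HI HOW ARE"

// Turn the local collection of words into an RDD[String].
val myRdd = sc.parallelize(myWords.split(" ").toSeq)

// On an RDD of (key, value) pairs, reduceByKey is available.
val myCount = myRdd.map(x => (x, 1)).reduceByKey((a, b) => a + b)

// Bring the results back to the driver and print them.
myCount.collect().foreach(println)
```

The order in which the (word, count) pairs are printed is not guaranteed, since it depends on how Spark partitions the data.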

Upvotes: 3

meucaa

Reputation: 1515

A more "classy" solution to count words in plain Scala:

val myWords = "HI HOW HI HOW ARE"
// Fold over the words, accumulating a Map from each word to its running count.
val myCount = myWords.split(" ").foldLeft(Map.empty[String, Int]) { (count, word) =>
  count + (word -> (count.getOrElse(word, 0) + 1))
}

As for what is wrong with your code: you are calling the reduceByKey method, which does not exist on the collection you are using (a plain Scala Array). It is provided by Spark for RDDs of key-value pairs.

Upvotes: 3
