Reputation: 662
From the Spark example (https://spark.apache.org/examples.html), the code looks like:
val file = spark.textFile("hdfs://...")
val counts = file.flatMap(line => line.split(" "))
                 .map(word => (word, 1))
                 .reduceByKey(_ + _)
And it works when compiled. However, if I try this exact code in the Spark REPL:
scala> val lines = "abc def"
lines: String = abc def
scala> val words = lines.flatMap(_.split(" "))
<console>:12: error: value split is not a member of Char
       val words = lines.flatMap(_.split(" "))
                                   ^
What gives??
Thanks, Matt
Upvotes: 4
Views: 10872
Reputation: 67075
lines is just a String, so flatMap is being run against a sequence of characters. You need to use an RDD:
val rddYouCanUse = sc.parallelize(List("abc def"))
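Putting it together, here is a minimal sketch of the word count from the question run against that RDD (this assumes the spark-shell, where sc is the SparkContext the shell provides):
val rdd = sc.parallelize(List("abc def"))
val counts = rdd.flatMap(_.split(" "))   // RDD[String]: one element per word
                .map(word => (word, 1))  // RDD[(String, Int)]: pair each word with 1
                .reduceByKey(_ + _)      // sum the 1s per distinct word
counts.collect()                         // Array((abc,1), (def,1)), order may vary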
Upvotes: 2
Reputation: 43
In the Spark example, file is an RDD[String] (conceptually, something like an Iterator[String] over lines). In your code, lines is just a String. No iterator. That means when you flatMap the String, you're treating the String itself as the collection, so each element is a Char. And Char doesn't have split as a method (that wouldn't make sense for a single character).
If you break it out a little:
val words = lines.flatMap(x => x.split(" "))
                          ^ this x is a Char
You can just split on the string itself:
val words = lines.split(" ")
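To see why the elements are Chars, here is a quick check you can run in any plain Scala REPL (no Spark needed):
// A String is implicitly a Seq[Char], so its elements are single characters
"abc def".head            // Char = a
"abc def".map(_.isUpper)  // IndexedSeq[Boolean] = Vector(false, false, ...)
// Char has no split method, which is why _.split(" ") fails inside flatMap.
// Splitting the String directly gives an Array[String]:
"abc def".split(" ")      // Array(abc, def)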
Upvotes: 2