Reputation: 566
For example, we have the string "abcdabcd" and we want to count all the pairs (e.g. "ab" or "da") that appear in it.
So how do we do that in Apache Spark?
I ask this because it looks like RDD does not support a sliding function:
rdd.sliding(2).toList
// count the number of pairs in the list
// fails to compile on the first line: sliding is not a member of RDD
Upvotes: 1
Views: 760
Reputation: 16308
Apparently it does support sliding via mllib, as shown by zero323 here:
import org.apache.spark.mllib.rdd.RDDFunctions._
val str = "abcdabcd"
val rdd = sc.parallelize(str)
rdd.sliding(2).map(_.mkString).toLocalIterator.foreach(println)
will show:
ab
bc
cd
da
ab
bc
cd
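To get the counts the question actually asks for, you can follow the same sliding step with countByValue. This is a minimal sketch assuming the same sc and str as above; note that countByValue brings the result back to the driver as a local Map, which is fine here since there are only a few distinct pairs:
import org.apache.spark.mllib.rdd.RDDFunctions._

val str = "abcdabcd"
val rdd = sc.parallelize(str)                  // RDD[Char] via the implicit String -> Seq[Char] conversion
val pairs = rdd.sliding(2).map(_.mkString)     // RDD[String] of pairs: ab, bc, cd, da, ab, bc, cd

// total number of pairs in the string
val totalPairs = pairs.count()                 // 7

// occurrences of each distinct pair, e.g. Map(ab -> 2, bc -> 2, cd -> 2, da -> 1)
val pairCounts = pairs.countByValue()
pairCounts.foreach { case (pair, n) => println(s"$pair: $n") }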
Upvotes: 5