springhibernatetutes

Reputation: 91

Finding sum of values in RDD

I have a sample file containing numbers separated by spaces. I need to find the sum of those numbers.

Here is what the file looks like:

10 20 30 40 50 60 70
1 2 3 4 5 6 7 8 9 10
10 20 30 40 50 60 70

I loaded the file with textFile and then applied flatMap to split each line on spaces, but I cannot find a sum function to add up the elements.

Here is the code:

val rdd = sc.textFile("/tmp/numbers.txt")

val numRdd = rdd.flatMap(lines => lines.split(" "))

Upvotes: 1

Views: 1992

Answers (1)

Rishu S

Reputation: 3968

You could chain a map after the flatMap to convert each token to Int, and then call sum() on the resulting RDD.

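import org.apache.spark.rdd.RDD

val data = Seq("10 20 30 40 50 60 70 1 2 3 4 5 6 7 8 9 10 10 20 30 40 50 60 70")
val rdd: RDD[String] = sc.parallelize(data)
// Split each line on spaces, then convert every token to Int
val dataSplit = rdd.flatMap(x => x.split(" ")).map(x => x.toInt)
// sum() on a numeric RDD returns a Double
val sumData = dataSplit.sum()
println("Total sum " + sumData)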
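
If you want to keep the multi-line layout from the question and get an integer result, something like the following might work. This is only a sketch: the helper name sumNumbers is made up for illustration, the local[*] master and the choice of Long are assumptions, and the three input lines are copied from the question in place of the real file.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.SparkSession

// Hypothetical helper (not part of Spark): sums whitespace-separated integers.
def sumNumbers(lines: RDD[String]): Long =
  lines
    .flatMap(_.trim.split("\\s+"))  // split on any run of whitespace
    .filter(_.nonEmpty)             // drop empty tokens left by blank lines
    .map(_.toLong)                  // Long is an assumption, chosen to limit overflow
    .fold(0L)(_ + _)                // stays integral, unlike sum()

// Local session for illustration only; in spark-shell, `sc` already exists.
val spark = SparkSession.builder().master("local[*]").appName("SumNumbers").getOrCreate()
val sc = spark.sparkContext

// The three lines from the question, standing in for the file contents.
val sample = sc.parallelize(Seq(
  "10 20 30 40 50 60 70",
  "1 2 3 4 5 6 7 8 9 10",
  "10 20 30 40 50 60 70"
))

val total = sumNumbers(sample)
println(s"Total sum $total")  // prints "Total sum 615"

spark.stop()  // omit this when running inside spark-shell
```

For the actual file you would pass sc.textFile("/tmp/numbers.txt") to the helper instead of the sample RDD. Splitting on "\\s+" rather than a single space avoids empty tokens when the numbers are separated by more than one space.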

Upvotes: 2
