Daksh Agarwal

Reputation: 167

ArrayBuffer not persisting values outside loop in Scala

I am trying to convert an RDD to a 2-D array. I am using the code below:

import scala.collection.mutable.ArrayBuffer
var temp=new ArrayBuffer[ArrayBuffer[_>:Double]]
f.foreach(x => {
  temp :+= ArrayBuffer(x(0), x(1), x(2), x(3), x(4))
  println(temp)
})
println(temp)

Here f is my RDD. The println statement inside the loop works correctly, but the println after the loop prints an empty buffer. Can someone please explain why this is happening? Thanks in advance.

Upvotes: 1

Views: 439

Answers (1)

Ramesh Maharjan

Reputation: 41957

Since you haven't shown how f is defined, I am assuming it is an RDD of arrays (RDD[Array]).

RDDs are distributed by nature. When you apply a function such as map, foreach or reduce to an RDD, Spark serializes the function together with the variables it captures and runs it on the executor nodes. Each executor therefore works on its own copy of temp, while the ArrayBuffer that temp points to on the driver node is never touched. The println inside the loop shows the executor's copy growing; the println after the loop shows the driver's buffer, which is still empty.
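The copy effect can be reproduced without Spark. The following is only a rough sketch: it assumes plain Java serialization as a stand-in for the way a closure is shipped to an executor, and the names ClosureCopyDemo, roundTrip and sizesAfterRemoteAppend are made up for this illustration.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream, ObjectOutputStream}
import scala.collection.mutable.ArrayBuffer

object ClosureCopyDemo {
  // Hypothetical helper: serialize a value and read it back, which yields an
  // independent copy (an approximation of what an executor receives).
  def roundTrip[T <: AnyRef](value: T): T = {
    val bytes = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(bytes)
    out.writeObject(value)
    out.close()
    val in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray))
    try in.readObject().asInstanceOf[T] finally in.close()
  }

  // Returns (size of the "driver" buffer, size of the "executor" copy).
  def sizesAfterRemoteAppend(): (Int, Int) = {
    val driverBuffer = ArrayBuffer.empty[ArrayBuffer[Double]]
    val executorCopy = roundTrip(driverBuffer)
    // The append lands on the copy only; the original stays empty.
    executorCopy += ArrayBuffer(1.0, 2.0, 3.0, 4.0, 5.0)
    (driverBuffer.size, executorCopy.size)
  }

  def main(args: Array[String]): Unit = {
    val (driverSize, executorSize) = sizesAfterRemoteAppend()
    println(s"executor copy has $executorSize row(s), driver buffer has $driverSize")
  }
}
```

The copy ends up with one row while the original buffer stays empty, which mirrors what the two println calls in the question show.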

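The fix is to collect f onto the driver before calling foreach, so that the loop runs on the driver and updates the same temp. Note that collect loads the whole RDD into driver memory, so this only suits data small enough to fit there: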

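import scala.collection.mutable.ArrayBuffer
// val is enough here: += mutates the buffer in place
val temp = new ArrayBuffer[ArrayBuffer[_ >: Double]]

// collect brings every row to the driver, so the loop below runs locally
f.collect.foreach(x => {
  temp += ArrayBuffer(x(0), x(1), x(2), x(3), x(4))
  println(temp)
})
println(temp)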

This should give you the expected output.
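If the end goal is just a 2-D array, the mutable buffer can probably be skipped altogether. The sketch below assumes that f is an RDD[Array[Double]] (the question does not say) and that Spark is on the classpath; the object name RddTo2DArray, the helper firstFiveColumns and the sample rows are invented for illustration.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.SparkSession

object RddTo2DArray {
  // Hypothetical helper: keep the first five columns of each row on the
  // executors, then bring the rows to the driver as a plain 2-D array.
  def firstFiveColumns(f: RDD[Array[Double]]): Array[Array[Double]] =
    f.map(_.take(5)).collect()

  def main(args: Array[String]): Unit = {
    // Local session for illustration only; in spark-shell `spark` already exists.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("RddTo2DArray")
      .getOrCreate()

    // Stand-in for `f`; assumed here to be an RDD[Array[Double]].
    val f: RDD[Array[Double]] = spark.sparkContext.parallelize(Seq(
      Array(1.0, 2.0, 3.0, 4.0, 5.0),
      Array(6.0, 7.0, 8.0, 9.0, 10.0)
    ))

    val temp = firstFiveColumns(f)
    temp.foreach(row => println(row.mkString(", ")))

    spark.stop()
  }
}
```

As with the collect version above, everything ends up in driver memory, so this is only reasonable for small RDDs.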

Upvotes: 2
