Reputation: 167
I am trying to convert an RDD to a 2-D array. I am using the code below for this:
import scala.collection.mutable.ArrayBuffer
var temp=new ArrayBuffer[ArrayBuffer[_>:Double]]
f.foreach(x=> {
temp:+= ArrayBuffer(x(0),x(1),x(2),x(3),x(4))
println(temp)
})
println(temp)
Here f is my RDD. The println statement inside the loop works correctly, but the println outside the loop does not show any of the data. Can someone please explain why this is happening? Thanks in advance.
Upvotes: 1
Views: 439
Reputation: 41957
Since you haven't provided the definition of f, I am assuming it is an RDD[Array].
RDDs are distributed in nature. When we apply a function such as map, foreach, or reduce to an RDD, it is executed in a distributed manner: because the RDD is already distributed, the foreach function is carried out on the executor nodes. temp, however, points to the ArrayBuffer created on the driver node. Each task works on its own copy of temp that is serialized with the closure, which is why the println inside the loop shows data, while the distributed execution cannot update the ArrayBuffer that temp points to on the driver.
The correct solution is to collect f before applying the foreach function, as follows:
import scala.collection.mutable.ArrayBuffer
var temp = new ArrayBuffer[ArrayBuffer[_ >: Double]]
// collect brings every row to the driver, so foreach now runs locally
f.collect.foreach(x => {
  temp += ArrayBuffer(x(0), x(1), x(2), x(3), x(4))
  println(temp)
})
println(temp)
You should now get the expected output.
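If a plain 2-D array is all that is needed, the mutable buffer can probably be avoided altogether. The following is a minimal sketch, assuming f is an RDD[Array[Double]] with at least five values per row and that the collected data fits in driver memory. The names RddToArray and toTwoD, the sample rows, and the local master setting are hypothetical and only there for illustration.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object RddToArray {
  // Assumption: every row is an Array[Double]. Only the first five
  // values of each row are kept, mirroring x(0)..x(4) in the question.
  def toTwoD(rdd: RDD[Array[Double]]): Array[Array[Double]] =
    rdd.map(row => row.take(5)) // still distributed, runs on the executors
       .collect()               // brings all rows back to the driver

  def main(args: Array[String]): Unit = {
    // Hypothetical local setup, only so the sketch can run on its own.
    val conf = new SparkConf().setAppName("rdd-to-2d").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      val f: RDD[Array[Double]] = sc.parallelize(Seq(
        Array(1.0, 2.0, 3.0, 4.0, 5.0),
        Array(6.0, 7.0, 8.0, 9.0, 10.0)
      ))
      val temp = toTwoD(f)
      // temp is an ordinary local array, so this loop runs on the driver.
      temp.foreach(row => println(row.mkString(", ")))
    } finally {
      sc.stop()
    }
  }
}
```

Because collect returns a local Array, no shared mutable state has to cross the driver/executor boundary. For an RDD that is too large for the driver, collecting is not appropriate and the data should stay distributed.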
Upvotes: 2