Parvathy K

Reputation: 79

how to update an RDD

I have an RDD[(Int, Array[Double], Double, Double)].

val full_data = rdd.map(row => {
        val label = row._1
        val feature = row._2.map(_.toDouble)
        val QD = k_function(feature)
        val alpha = 0.0
        (label,feature,QD,alpha)
    })

Now I want to update the value of alpha in each record (say, to 10):

var tmp = full_data.map( x=> {
      x._4 = 10
    })

I got the error

Error: reassignment to val
         x._4 = 10

I have changed all the vals to vars, but the error still occurs. How can I update the value of alpha? I would also like to know how to update a full row or a specific row in an RDD.

Upvotes: 1

Views: 3968

Answers (1)

Ramesh Maharjan

Reputation: 41957

RDDs are immutable by design, which makes them easy to cache, share and replicate. In a distributed, multi-threaded system like Spark, it is always safer to copy than to mutate, both for fault tolerance and for correct processing. Immutable data is also much easier to recreate than mutable data.

A transformation is like copying the RDD's data into another RDD. The fields of each tuple are vals, i.e. immutable, so `x._4 = 10` cannot compile no matter how your own variables are declared. If you want to replace the last Double with 10, build a new tuple instead:

// Build a new tuple per record; 10.0 (not 10) keeps the fourth field a Double.
val tmp = full_data.map( x => {
      (x._1, x._2, x._3, 10.0)
    })
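For the second part of the question (updating only a specific row), the same idea applies: map over every record and return a changed copy only where a condition holds. Below is a minimal sketch, not a definitive implementation. The names `UpdateAlphaSketch`, `Record`, `setAlpha` and `setAlphaForLabel` are made up for illustration, a local `Seq` stands in for the RDD so the snippet runs without Spark, and selecting the row by its label is only an assumption about what "a specific row" means here. It relies on tuples being case classes, so `copy(_4 = ...)` is available.

```scala
object UpdateAlphaSketch {
  // Same shape as the records in full_data: (label, feature, QD, alpha).
  type Record = (Int, Array[Double], Double, Double)

  // Returns a new record with alpha replaced; the input tuple is not modified.
  def setAlpha(r: Record, alpha: Double): Record = r.copy(_4 = alpha)

  // Replaces alpha only when the label matches; other records pass through as-is.
  def setAlphaForLabel(r: Record, label: Int, alpha: Double): Record =
    if (r._1 == label) r.copy(_4 = alpha) else r

  def main(args: Array[String]): Unit = {
    // Local stand-in for the RDD (hypothetical sample data).
    val records: Seq[Record] = Seq(
      (0, Array(1.0, 2.0), 0.5, 0.0),
      (1, Array(3.0, 4.0), 0.7, 0.0)
    )

    val allUpdated = records.map(setAlpha(_, 10.0))            // every row
    val oneUpdated = records.map(setAlphaForLabel(_, 1, 10.0)) // only label 1

    println(allUpdated.map(_._4))
    println(oneUpdated.map(_._4))
  }
}
```

On the real data, the same functions should plug straight into the transformation, e.g. `full_data.map(UpdateAlphaSketch.setAlpha(_, 10.0))`. Either way the result is a new RDD and `full_data` itself stays unchanged.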

Upvotes: 3
