J.Done

Reputation: 3033

Scala - Keep Map in foreach

    var myMap: Map[String, Int] = Map()
    myRDD.foreach { data =>
      println("1. " + data.name + " : " + data.time)
      myMap += (data.name -> data.time)
      println("2. " + myMap)
    }
    println("Total Map : " + myMap)

Result

  1. A : 1
  2. Map(A -> 1)
  1. B : 2
  2. Map(B -> 2)    // key A is gone
  1. C : 3
  2. Map(C -> 3)    // keys A and B are gone

Total Map : Map()   // empty

Somehow I cannot accumulate data in the Map inside foreach: previous entries keep getting deleted (or the Map reinitialized) whenever a new key/value pair is added. Any idea why this happens?

Upvotes: 1

Views: 55

Answers (1)

maasg

Reputation: 37435

Spark closures are serialized and executed in a separate context (remotely, when running on a cluster). Each task works on its own deserialized copy of the closure, so the myMap variable on the driver is never updated.

To get the data from the RDD as a map, there's a built-in operation:

val myMap = rdd.collectAsMap()
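
Note that collectAsMap() is only defined on RDDs of key-value pairs. Assuming your records expose the name and time fields shown in the question, a minimal sketch would first map each record to a pair:

    // Sketch based on the question's field names (data.name, data.time).
    // Map each record to a (key, value) pair, then gather the pairs
    // on the driver as an immutable scala.collection.Map.
    val myMap: scala.collection.Map[String, Int] =
      myRDD.map(data => (data.name, data.time)).collectAsMap()

    println("Total Map : " + myMap)

Because this runs entirely on the driver after the collect, the map holds all entries instead of being lost in the executors' closures.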

Upvotes: 1
