Nick

Reputation: 2948

How to access the value of an accumulator in tasks?

I am trying to access an accumulator's value from inside a task running on the cluster, but when I do so it throws an exception:

can't read the accumulator's value

I tried to use row.localValue, but it returns the same numbers. Is there a workaround?

private def modifyDataset(
    data: String, row: org.apache.spark.Accumulator[Int]): Array[Int] = {

  val line = data.split(",")
  val lineSize = line.size
  val pairArray = new Array[Int](lineSize - 1)
  val a = row.value        // this read fails when executed inside a task
  pairArray(0) = a

  row += 1
  pairArray
}


import org.apache.spark.storage.StorageLevel

val sc = Spark_Context.InitializeSpark
val row = sc.accumulator(1, "Rows")

val dataset = sc.textFile("path")

val pairInfoFile = dataset.flatMap { data => modifyDataset(data, row) }
  .persist(StorageLevel.MEMORY_AND_DISK)
pairInfoFile.count()

Upvotes: 1

Views: 4231

Answers (1)

zero323

Reputation: 330393

It is simply not possible, and there is no workaround. Spark accumulators are write-only variables from the workers' perspective. Any attempt to read their value inside a task doesn't make sense, because there is no shared state between workers, and the local accumulator value reflects only the state of the current partition.
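
For illustration, here is a minimal sketch of the supported pattern, using the same Spark 1.x accumulator API as in the question (the app name, file path, and variable names are placeholders): tasks only add to the accumulator, and the driver reads value after an action has completed.

import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf().setAppName("accumulator-sketch")
val sc = new SparkContext(conf)

val rows = sc.accumulator(0, "Rows")

val dataset = sc.textFile("path")

val lineSizes = dataset.map { line =>
  rows += 1                   // adding to the accumulator inside a task is fine
  line.split(",").length
}

lineSizes.count()             // run an action so the updates reach the driver
println(rows.value)           // reading value is only valid here, on the driver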

Generally speaking, accumulators are intended mostly for diagnostics and shouldn't be used as part of the application logic. When they are used inside transformations, the only guarantee you get is at-least-once execution.
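
As a hypothetical illustration of what at-least-once means in practice: if an RDD is evaluated more than once and is not cached, the transformation, and therefore the accumulator update, runs again, so the counter ends up larger than the number of records.

val seen = sc.accumulator(0, "Seen")

val mapped = sc.parallelize(1 to 100).map { x =>
  seen += 1                   // update inside a transformation
  x
}

mapped.count()
mapped.count()                // the RDD is recomputed, so the updates are applied again

println(seen.value)           // 200 rather than 100 -- hence "at-least-once"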

See also: How to print accumulator variable from within task (seem to "work" without calling value method)?

Upvotes: 2
