Rahul Koshaley

Reputation: 201

Spark Accumulator value not read by task

I am initializing an accumulator

final Accumulator<Integer> accum = sc.accumulator(0);

Then, inside a map function, I'm trying to increment the accumulator and use its value to set a variable.

JavaRDD<UserSetGet> UserProfileRDD1 = temp.map(new Function<String, UserSetGet>() {

            @Override
            public UserSetGet call(String arg0) throws Exception {

                    UserSetGet usg = new UserSetGet();

                    accum.add(1);                            // increment inside the task
                    usg.setPid(accum.value().toString());    // read inside the task -- this is what fails

                    return usg;
            }
  });

But I'm getting the following error.

16/03/14 09:12:58 ERROR executor.Executor: Exception in task 0.0 in stage 2.0 (TID 2) java.lang.UnsupportedOperationException: Can't read accumulator value in task

EDITED - As per the answer from Avihoo Mamka, reading an accumulator's value inside a task is not possible.

So is there any way I can achieve the same thing in parallel, such that the Pid value gets set each time a variable (e.g. something like a static variable) is incremented in my map function?

Upvotes: 2

Views: 4138

Answers (1)

Avihoo Mamka

Reputation: 4786

From the Spark docs

Accumulators are variables that are only “added” to through an associative operation and can therefore be efficiently supported in parallel. They can be used to implement counters (as in MapReduce) or sums

...

Only the driver program can read the accumulator’s value, using its value method.

Therefore, trying to read the accumulator's value from within a task in Spark means you are reading it from a worker, which goes against the design that the accumulator's value can only be read from the driver.
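For illustration, a minimal sketch of the intended pattern (reusing the temp, accum and UserSetGet names from the question; the variable names profiles and count are just for this example): add to the accumulator inside the task, run an action, and only read accum.value() afterwards, on the driver.

// Workers may only add to the accumulator; they must not read it.
JavaRDD<UserSetGet> profiles = temp.map(new Function<String, UserSetGet>() {
            @Override
            public UserSetGet call(String line) throws Exception {
                    accum.add(1);              // fine: adding from a task
                    return new UserSetGet();   // do not call accum.value() here
            }
});

// An action forces the job to run and the per-task additions to be merged.
long count = profiles.count();

// Only now, back on the driver, is it safe to read the accumulated value.
System.out.println("Lines processed: " + accum.value());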

Upvotes: 6
