Rahul Koshaley

Reputation: 201

Spark Accumulator value not read by task

I am initializing an accumulator

final Accumulator<Integer> accum = sc.accumulator(0);

Then, inside a map function, I'm trying to increment the accumulator and use its value to set a variable.

JavaRDD<UserSetGet> UserProfileRDD1 = temp.map(new Function<String, UserSetGet>() {

            @Override
            public UserSetGet call(String arg0) throws Exception {

                    UserSetGet usg = new UserSetGet();

                    accum.add(1);                            // increment inside the task
                    usg.setPid(accum.value().toString());    // read inside the task -- this is what fails

                    return usg;
            }
  });

But I'm getting the following error.

16/03/14 09:12:58 ERROR executor.Executor: Exception in task 0.0 in stage 2.0 (TID 2) java.lang.UnsupportedOperationException: Can't read accumulator value in task

EDITED - As per the answer from Avihoo Mamka, reading an accumulator's value inside a task is not possible.

So is there any way I can achieve the same thing in parallel, such that the Pid value gets set each time a variable (e.g. something like a static variable) is incremented in my map function?

Upvotes: 2

Views: 4138

Answers (1)

Avihoo Mamka

Reputation: 4786

From the Spark docs

Accumulators are variables that are only “added” to through an associative operation and can therefore be efficiently supported in parallel. They can be used to implement counters (as in MapReduce) or sums

...

Only the driver program can read the accumulator’s value, using its value method.

Therefore, trying to read the accumulator's value from within a task in Spark means you are reading it from a worker, which goes against the design that the accumulator's value can only be read from the driver.
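For illustration, a minimal sketch of the intended pattern (reusing the temp, accum and UserSetGet names from the question; the variable names profiles and count are just for this example): add to the accumulator inside the task, run an action, and only read accum.value() afterwards, on the driver.

// Workers may only add to the accumulator; they must not read it.
JavaRDD<UserSetGet> profiles = temp.map(new Function<String, UserSetGet>() {
            @Override
            public UserSetGet call(String line) throws Exception {
                    accum.add(1);              // fine: adding from a task
                    return new UserSetGet();   // do not call accum.value() here
            }
});

// An action forces the job to run and the per-task additions to be merged.
long count = profiles.count();

// Only now, back on the driver, is it safe to read the accumulated value.
System.out.println("Lines processed: " + accum.value());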

Upvotes: 6
