Reputation: 201
I am initializing an accumulator
final Accumulator<Integer> accum = sc.accumulator(0);
Then, inside a map function, I'm trying to increment the accumulator and use its value to set a variable.
JavaRDD<UserSetGet> UserProfileRDD1 = temp.map(new Function<String, UserSetGet>() {
    @Override
    public UserSetGet call(String arg0) throws Exception {
        UserSetGet usg = new UserSetGet();
        accum.add(1);
        usg.setPid(accum.value().toString()); // reading the accumulator value inside the task
        return usg;
    }
});
But I'm getting the following error:
16/03/14 09:12:58 ERROR executor.Executor: Exception in task 0.0 in stage 2.0 (TID 2) java.lang.UnsupportedOperationException: Can't read accumulator value in task
EDITED - As per the answer from Avihoo Mamka, getting the accumulator value inside tasks is not possible.
So is there any way I can achieve the same thing in parallel, such that the Pid value gets set each time a variable (e.g. something like a static counter) is incremented in my map function?
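For illustration, a rough sketch of the kind of result I'm after, using zipWithUniqueId (an assumption on my part, and it only guarantees unique ids per record, not strictly sequential ones):

JavaPairRDD<String, Long> withIds = temp.zipWithUniqueId();

JavaRDD<UserSetGet> userProfileRDD = withIds.map(
        new Function<Tuple2<String, Long>, UserSetGet>() {
            @Override
            public UserSetGet call(Tuple2<String, Long> t) throws Exception {
                UserSetGet usg = new UserSetGet();
                usg.setPid(t._2().toString()); // per-record id, assigned in parallel without a shared counter
                return usg;
            }
        });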
Upvotes: 2
Views: 4138
Reputation: 4786
From the Spark docs
Accumulators are variables that are only “added” to through an associative operation and can therefore be efficiently supported in parallel. They can be used to implement counters (as in MapReduce) or sums.
...
Only the driver program can read the accumulator’s value, using its value method.
Therefore, trying to read the accumulator's value from within a task in Spark means trying to read it from a worker, which goes against the concept that the accumulator's value may only be read from the driver.
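A minimal sketch of the pattern the docs describe, reusing the names from the question (sc, temp, UserSetGet): tasks only add to the accumulator, an action forces them to run, and value() is read afterwards on the driver.

final Accumulator<Integer> accum = sc.accumulator(0);

JavaRDD<UserSetGet> userProfileRDD = temp.map(new Function<String, UserSetGet>() {
    @Override
    public UserSetGet call(String line) throws Exception {
        accum.add(1);             // OK: tasks may only add to the accumulator
        return new UserSetGet();  // do not call accum.value() here
    }
});

userProfileRDD.count();                            // an action triggers the tasks
System.out.println("records: " + accum.value());   // OK: value() is read on the driver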
Upvotes: 6