Reputation: 171
Accumulators and broadcast variables are a way of sharing variables across all nodes of a cluster, although there are restrictions on their usage.
I am looking for a type of shared variable that lives in an executor, so that each task running in that executor can update and read it. Each executor may have its own instance, so nothing would be transmitted over the network. Are there any solutions for this?
Upvotes: 1
Views: 661
Reputation: 35229
On JVM
Yes. Any singleton object can act as a "shared variable", since each executor JVM holds its own instance. This of course requires some form of synchronization if tasks can both update and read it, and as a result it may become a serious bottleneck.
Also, if updates are not idempotent, you cannot guarantee correctness when a task is recomputed.
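A minimal sketch of the singleton approach in plain Java, without Spark. `ExecutorLocalCounter` and `runTasks` are hypothetical names, and the thread pool only stands in for tasks running concurrently inside one executor JVM. `AtomicLong` is one possible way to provide the synchronization. The second call shows what a recomputed task does to a non-idempotent update.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical singleton: static state exists once per JVM, so each
// executor would have its own copy and nothing is sent over the network.
final class ExecutorLocalCounter {
    private static final AtomicLong COUNT = new AtomicLong();

    private ExecutorLocalCounter() {}

    static long add(long delta) { return COUNT.addAndGet(delta); }
    static long get() { return COUNT.get(); }
    static void reset() { COUNT.set(0); }

    // Simulates `tasks` concurrent tasks in one executor (assumption:
    // threads stand in for Spark tasks); each adds 1 `perTask` times.
    static long runTasks(int tasks, int perTask) {
        ExecutorService pool = Executors.newFixedThreadPool(tasks);
        try {
            List<Future<?>> futures = new ArrayList<>();
            for (int t = 0; t < tasks; t++) {
                futures.add(pool.submit(() -> {
                    for (int i = 0; i < perTask; i++) {
                        add(1);
                    }
                }));
            }
            for (Future<?> f : futures) {
                f.get(); // wait for every simulated task to finish
            }
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
        return get();
    }

    public static void main(String[] args) {
        reset();
        System.out.println(runTasks(4, 1000)); // 4000
        // Running the same tasks again, as a retry would, counts them twice.
        System.out.println(runTasks(4, 1000)); // 8000
    }
}
```

The atomic counter keeps concurrent updates correct, but every task contends on the same variable, and the repeated run illustrates why non-idempotent updates are unsafe under recomputation.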
In Python
No. Each Python worker runs in a separate process, so there is no shared memory between tasks.
Upvotes: 1