Reputation: 6196
For example, I have one Spark session, and it contains only one action and many transformations. Assume no partition fails while the tasks execute. Is cache unnecessary in this case, since caching is meant for sharing an RDD between actions?
Upvotes: 0
Views: 110
Reputation: 10092
You pretty much answered your own question.
cache will only take effect after at least one action has been called on the RDD you cached. This means the entire DAG of the RDD has to be computed from scratch at least once.
Since you only have one action, cache will not do anything except eat up your executor memory.
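To make the reasoning concrete, here is a rough toy model in plain Python. It is not Spark code and does not use the Spark API: `LazyDataset`, `make_source`, `run`, and the `source_runs` counter are made-up names for illustration only. It assumes a simplified picture in which transformations are lazy, each action recomputes the lineage unless a cached result exists, and the cache is filled the first time an action runs.

```python
class LazyDataset:
    """Toy stand-in for an RDD (hypothetical, not the Spark API)."""

    def __init__(self, compute):
        self._compute = compute          # zero-argument function that builds the data
        self._cache_requested = False    # set by cache(); nothing is stored yet
        self._cached = None              # filled by the first action after cache()

    def map(self, fn):
        # Lazy transformation: only records how to derive the new dataset.
        parent = self
        return LazyDataset(lambda: [fn(x) for x in parent._materialize()])

    def cache(self):
        # Only marks the dataset; no computation happens here.
        self._cache_requested = True
        return self

    def _materialize(self):
        if self._cached is not None:
            return self._cached
        result = self._compute()
        if self._cache_requested:
            self._cached = result
        return result

    def count(self):
        # Action: forces the lineage to be evaluated.
        return len(self._materialize())


def make_source(data):
    """Build a source dataset plus a counter of how often it was recomputed."""
    stats = {"source_runs": 0}

    def compute():
        stats["source_runs"] += 1
        return list(data)

    return LazyDataset(compute), stats


def run(actions, use_cache):
    """Run `actions` count() calls and return how often the source was computed."""
    source, stats = make_source([1, 2, 3])
    doubled = source.map(lambda x: x * 2)
    if use_cache:
        doubled.cache()
    for _ in range(actions):
        doubled.count()
    return stats["source_runs"]


if __name__ == "__main__":
    for actions in (1, 2):
        for use_cache in (False, True):
            print(actions, use_cache, run(actions, use_cache))
```

In this toy model, one action computes the lineage exactly once whether or not cache was requested, so caching saves nothing and only holds on to the result. The saving only shows up from the second action onward.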
Upvotes: 3