Reputation: 707
I've recently started learning Spark and I'm confused about transformations and actions. I've read the Spark documentation and some books on Spark, and I know that an action causes a Spark job to run on the cluster while a transformation does not. But the RDD operations listed in Spark's API docs don't say whether each one is a transformation or an action.
For example, reduce is an action, but reduceByKey is a transformation! Why is that?
Upvotes: 8
Views: 3367
Reputation: 67135
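You can tell by looking at the return type. An action returns a non-RDD type (usually the type of the values you stored), whereas a transformation returns an RDD[Type], because it is still just a representation of your computation. That is why reduce is an action (it collapses everything into a single value) while reduceByKey is a transformation (it yields one reduced value per key, which is still an RDD of key/value pairs).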
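To make the return-type rule concrete, here is a rough sketch in plain Python. It is not Spark's API: ToyRDD and its methods (of, map, reduce_by_key, reduce, collect) are hypothetical names invented for illustration, and everything runs in a single process. It only mimics the idea that a transformation hands back another lazy object while an action hands back a plain value and therefore has to run the computation.

```python
from functools import reduce as fold
from typing import Callable, Iterable, List


class ToyRDD:
    """Hypothetical stand-in for an RDD: it stores a recipe, not results."""

    def __init__(self, compute: Callable[[], Iterable]):
        self._compute = compute  # nothing runs until an action calls this

    @staticmethod
    def of(items: List) -> "ToyRDD":
        return ToyRDD(lambda: iter(items))

    # Transformations: return another ToyRDD, so no work happens yet.
    def map(self, f: Callable) -> "ToyRDD":
        return ToyRDD(lambda: (f(x) for x in self._compute()))

    def reduce_by_key(self, f: Callable) -> "ToyRDD":
        def run():
            merged = {}
            for key, value in self._compute():
                merged[key] = f(merged[key], value) if key in merged else value
            return iter(merged.items())
        return ToyRDD(run)

    # Actions: return plain values, so the recipe has to run now.
    def reduce(self, f: Callable):
        return fold(f, self._compute())

    def collect(self) -> List:
        return list(self._compute())


seen = []  # records which elements have actually been processed


def record(pair):
    seen.append(pair)
    return pair


pairs = ToyRDD.of([("a", 1), ("b", 2), ("a", 3)]).map(record)
sums = pairs.reduce_by_key(lambda x, y: x + y)  # still a ToyRDD
ran_before_action = len(seen)                   # 0: nothing has executed

per_key = sums.collect()                              # action: runs the chain
total = ToyRDD.of([1, 2, 3]).reduce(lambda x, y: x + y)  # action: plain int
```

Here sums is still a ToyRDD because there is one result per key, so the output is itself a collection that can be transformed further. total is an ordinary int, so there is nothing left to defer. In real Spark the same distinction shows up in the method signatures in the API docs.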
Upvotes: 15