Jun Wang

Reputation: 707

How can I tell whether an operation in Spark is a transformation or an action?

I've recently started learning Spark and I'm confused about transformations and actions. I've read the Spark documentation and some books about Spark, and I know that an action causes a Spark job to be executed on the cluster while a transformation does not. But the RDD operations listed in Spark's API docs don't state whether each one is a transformation or an action.

For example, reduce is an action, but reduceByKey is a transformation! Why is that?

Upvotes: 8

Views: 3367

Answers (1)

Justin Pihony

Reputation: 67135

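You can tell by looking at the return type. An action returns a non-RDD type (usually the type of your stored values), whereas a transformation returns an RDD[Type], because it is still just a representation of your computation. That is why reduce, which returns a single value, is an action, while reduceByKey, which returns an RDD of (key, value) pairs, is a transformation.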
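To make the return-type rule concrete, here is a minimal sketch in plain Python. It is a toy model, not Spark's API: `ToyRDD`, `from_list`, `reduce_by_key` and `traced` are hypothetical names invented for illustration. The assumption is only that a transformation hands back another lazy dataset and an action hands back an ordinary value.

```python
from functools import reduce as fold


class ToyRDD:
    """Toy stand-in for an RDD (hypothetical, not Spark): holds a recipe, not data."""

    def __init__(self, compute):
        self._compute = compute  # only called when an action runs

    @staticmethod
    def from_list(data):
        return ToyRDD(lambda: iter(list(data)))

    # Transformations: return a new ToyRDD and run nothing.
    def map(self, f):
        return ToyRDD(lambda: (f(x) for x in self._compute()))

    def reduce_by_key(self, f):
        def compute():
            acc = {}
            for k, v in self._compute():
                acc[k] = f(acc[k], v) if k in acc else v
            return iter(acc.items())
        return ToyRDD(compute)

    # Actions: run the recipe and return a plain (non-ToyRDD) value.
    def reduce(self, f):
        return fold(f, self._compute())

    def collect(self):
        return list(self._compute())


calls = []  # records when the mapping function actually runs


def traced(x):
    calls.append(x)
    return (x % 2, x)  # key by parity


nums = ToyRDD.from_list([1, 2, 3, 4])
pairs = nums.map(traced)                        # transformation -> ToyRDD
sums = pairs.reduce_by_key(lambda a, b: a + b)  # transformation -> ToyRDD
calls_before_action = list(calls)               # still empty: nothing has run

total = nums.reduce(lambda a, b: a + b)         # action -> plain int, 10
result = sorted(sums.collect())                 # action -> [(0, 6), (1, 4)]
```

Here `reduce` collapses everything to one value, so there is nothing left to represent lazily, while `reduce_by_key` still yields one pair per key and so stays a dataset you can keep chaining on. In real Spark the same check is to read the method signature in the API docs and see whether the return type is an RDD.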

Upvotes: 15
