Reputation: 21
When we use the show, take, or write actions in Spark, is all the data sent to the driver? If not, why does all the data go to the driver when we use collect?
Upvotes: 2
Views: 1215
Reputation: 42352
show and take fetch only the number of rows you request (for example, 20 rows, which is the default for show) onto the driver, while collect fetches the data in the whole DataFrame, across all partitions, onto the driver. write outputs the whole DataFrame to a file location, but it generally does so in a partitioned manner: each executor writes the data contained in its partitions directly to the file system, so the rows do not have to pass through the driver.
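To make the difference concrete, here is a rough toy model in plain Python. It is not Spark code: lists stand in for partitions, and the helper names (make_partitions, toy_collect, toy_take, toy_write) are hypothetical, made up for this illustration. Real Spark is more involved (for instance, take may scan a growing batch of partitions rather than one at a time), so treat this only as a sketch of where the rows end up.

```python
# Toy model only: plain Python lists stand in for Spark partitions.
# None of these helpers are Spark APIs; all names are hypothetical.

def make_partitions(rows, num_partitions):
    """Split rows round-robin into num_partitions lists."""
    return [rows[i::num_partitions] for i in range(num_partitions)]

def toy_collect(partitions):
    """Like collect: every partition ships all of its rows to the driver."""
    driver_rows = []
    for part in partitions:
        driver_rows.extend(part)
    return driver_rows

def toy_take(partitions, n):
    """Like take(n): read partitions in order, stop once n rows are gathered."""
    driver_rows = []
    partitions_read = 0
    for part in partitions:
        if len(driver_rows) >= n:
            break
        partitions_read += 1
        # Only the rows still needed are brought back from this partition.
        driver_rows.extend(part[: n - len(driver_rows)])
    return driver_rows, partitions_read

def toy_write(partitions):
    """Like write: each partition produces its own output 'file'.

    The driver only sees metadata (file names and row counts), not the rows.
    """
    files = {f"part-{i:05d}": list(part) for i, part in enumerate(partitions)}
    driver_view = {name: len(rows) for name, rows in files.items()}
    return files, driver_view

partitions = make_partitions(list(range(1000)), 4)

collected = toy_collect(partitions)                 # all rows on the driver
taken, partitions_read = toy_take(partitions, 20)   # only 20 rows on the driver
files, driver_view = toy_write(partitions)          # rows stay with the partitions

print(len(collected), len(taken), partitions_read, len(files))
```

In this model, toy_collect makes the driver hold every row, toy_take touches only as many partitions as it needs to satisfy the request, and toy_write leaves the rows with the partitions and hands the driver just a summary.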
Upvotes: 2