Reputation: 23
As I understand it, Spark uses parallel I/O to read files. That conclusion comes from other Stack Overflow answers.
My question is: does Spark read data using an independent approach or a collective approach? In other words, does each worker read its own fixed chunk of the data, or do the workers communicate with each other and collaborate to read the data efficiently?
Upvotes: 1
Views: 991
Reputation: 73
The workers communicate through the driver, not directly with each other, and each worker processes its own data.
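To make the "each worker processes its own data" idea concrete, here is a minimal pure-Python sketch of an independent read. It is not Spark code and does not use Spark's API: the names `plan_splits`, `read_split`, and `read_file_in_parallel` are hypothetical, threads stand in for workers, and the line-boundary rule is only loosely modeled on how Hadoop-style input splits are commonly described. The "driver" only hands out byte ranges; the "workers" never talk to each other.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def plan_splits(path, num_splits):
    """Driver side (hypothetical): cut the file into contiguous byte ranges."""
    size = os.path.getsize(path)
    step = max(1, -(-size // num_splits))  # ceiling division
    return [(start, min(start + step, size)) for start in range(0, size, step)]


def read_split(path, start, end):
    """Worker side (hypothetical): return only the lines that begin in [start, end)."""
    lines = []
    with open(path, "rb") as f:
        if start > 0:
            # Step back one byte and discard up to the next newline, so we
            # land on the first line that starts at or after `start`.
            f.seek(start - 1)
            f.readline()
        while f.tell() < end:
            line = f.readline()
            if not line:
                break
            # A line that starts before `end` is read in full, even if it
            # runs past `end`; the next split skips that tail.
            lines.append(line.rstrip(b"\n").decode())
    return lines


def read_file_in_parallel(path, num_splits=4):
    """Driver assigns splits; each task reads its own range with no coordination."""
    splits = plan_splits(path, num_splits)
    with ThreadPoolExecutor(max_workers=num_splits) as pool:
        return list(pool.map(lambda s: read_split(path, *s), splits))
```

Calling `read_file_in_parallel("some_file.txt", num_splits=4)` returns one list of lines per split, and concatenating them in order gives back the file's lines. The only shared information is the split plan produced up front, which mirrors the answer above: coordination goes through the driver, and the reads themselves are independent.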
Upvotes: 1