Reputation: 11
I would like to know the answer to the following question:
How is an RDD processed if the number of executors is less than the number of partitions in the RDD?
Upvotes: 0
Views: 160
Reputation: 42597
This is a very common situation; indeed, you would normally configure your jobs so that there are more tasks than executors (see https://blog.cloudera.com/blog/2015/03/how-to-tune-your-apache-spark-jobs-part-2/).
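To make that concrete, here is a minimal sketch (the executor counts, app name, and partition count are just example values; in a real cluster deployment the master would be supplied by spark-submit):

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Sketch only: 4 executors x 2 cores = 8 concurrent task slots.
// spark.executor.instances applies to YARN/Kubernetes deployments.
val conf = new SparkConf()
  .setAppName("partition-ratio-example")
  .set("spark.executor.instances", "4") // number of executors
  .set("spark.executor.cores", "2")     // task slots per executor
val sc = new SparkContext(conf)

// 32 partitions -> 32 tasks, run in waves of at most 8 concurrent tasks.
val rdd = sc.parallelize(1 to 1000, numSlices = 32)
```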
Spark will create a task for each partition, and share out the tasks amongst the available executors (remember that an executor may have multiple cores, so it can handle more than one task concurrently).
So each executor will handle its share of the partitions until they are all processed.
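You can see this locally (a sketch, not anyone's production code): run with a local master that has fewer cores than the RDD has partitions, and watch each task slot work through several partitions in turn:

```scala
import org.apache.spark.{SparkConf, SparkContext}

object PartitionWaves {
  def main(args: Array[String]): Unit = {
    // local[2]: a single local "executor" with 2 cores, i.e. 2 concurrent tasks.
    val sc = new SparkContext(
      new SparkConf().setAppName("waves").setMaster("local[2]"))

    // 8 partitions -> 8 tasks, processed 2 at a time in 4 waves.
    val counts = sc.parallelize(1 to 80, numSlices = 8)
      .mapPartitionsWithIndex { (idx, it) =>
        // Log which thread (task slot) handles which partition.
        println(s"partition $idx on ${Thread.currentThread().getName}")
        Iterator(idx -> it.size)
      }
      .collect()

    counts.foreach { case (p, n) => println(s"partition $p had $n elements") }
    sc.stop()
  }
}
```

The console output shows only two distinct task threads, each handling four of the eight partitions.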
Spark will also try to give tasks to executors that are local to the data, where possible ("locality" - see What's the meaning of "Locality Level" on Spark cluster, for example), to minimise the amount of data that needs to be moved around the cluster.
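This behaviour is tunable: the standard spark.locality.wait setting controls how long the scheduler holds a task waiting for a data-local slot before accepting a less-local one. A sketch (3s is Spark's documented default, shown here explicitly):

```scala
import org.apache.spark.SparkConf

// Wait up to 3 seconds for a node-local slot before falling back to a
// rack-local (and eventually "any") slot. Lower it if your executors sit
// idle; raise it if shuffling data around the cluster is the bottleneck.
val conf = new SparkConf().set("spark.locality.wait", "3s")
```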
Upvotes: 1