Reputation: 2621
I am completely new to GCP. Is the user responsible for managing the amount of memory allocated to the driver and workers, and the number of CPUs, when running a Spark job on a Dataproc cluster? If so, what aspects of elasticity apply to Dataproc usage?
Thank you.
Upvotes: 1
Views: 1018
Reputation: 26548
Usually you don't: the resources of a Dataproc cluster are managed by YARN, and Spark jobs are automatically configured to make use of them. In particular, Spark dynamic allocation is enabled by default. Your application code still matters, though; for example, you need to specify an appropriate number of partitions.
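For illustration, these are the standard Spark properties you would override if you did want manual control (the property names are real Spark settings; the values below are only placeholders, not recommendations). On Dataproc they can be passed via the `--properties` flag of `gcloud dataproc jobs submit`:

```properties
# Dynamic allocation (enabled by default on Dataproc) lets YARN
# grow and shrink the number of executors with the workload.
spark.dynamicAllocation.enabled=true
spark.dynamicAllocation.minExecutors=1
spark.dynamicAllocation.maxExecutors=20

# Only set these if you want to override Dataproc's auto-tuned defaults.
spark.driver.memory=4g
spark.executor.memory=4g
spark.executor.cores=2

# Partitioning is driven by your application code and these settings,
# not by YARN; 200 is Spark's default for shuffle partitions.
spark.sql.shuffle.partitions=200
```

In most cases you would leave all of these alone and only tune partitioning in your application code.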
Upvotes: 2