Reputation: 111
We are trying to run Spark jobs on a Dataproc cluster in cluster mode, but it is not working. We tried setting the property with --properties spark.submit.deployMode=cluster, but that failed.
Can someone give more info on how to set this up?
Thanks in advance.
Upvotes: 3
Views: 344
Reputation: 4465
It seems the problem is that you didn't specify the spark: prefix when setting the spark.submit.deployMode property during cluster creation.
In Dataproc, if you set properties at cluster creation time, you need to prefix them with the component they apply to; see the Dataproc cluster properties documentation for details.
This command should create a cluster on which Spark jobs are submitted in cluster mode:
CLUSTER_NAME=<cluster_name>
gcloud dataproc clusters create ${CLUSTER_NAME} \
--properties=spark:spark.submit.deployMode=cluster
Note that in cluster mode, Dataproc cannot stream the Spark driver output to gcloud or the Cloud Console.
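If you would rather not change the cluster-wide default, the deploy mode can also be set per job at submission time. As far as I know, job-level properties are passed without the spark: prefix, since the job type already identifies the component. Below is a rough sketch of that; the cluster name, region, and the SparkPi example job are placeholders I picked for illustration, not values from the question. The function only echoes the command as a dry run, so remove the echo to actually submit.

```shell
# Placeholder values (assumptions for illustration); override via the environment.
CLUSTER_NAME="${CLUSTER_NAME:-my-cluster}"
REGION="${REGION:-us-central1}"

# Job-level property: no "spark:" prefix, unlike cluster creation.
JOB_PROPERTIES="spark.submit.deployMode=cluster"

# Prints the submit command as a dry run; drop "echo" to run it for real.
submit_cmd() {
  echo gcloud dataproc jobs submit spark \
    "--cluster=${CLUSTER_NAME}" \
    "--region=${REGION}" \
    --class=org.apache.spark.examples.SparkPi \
    --jars=file:///usr/lib/spark/examples/jars/spark-examples.jar \
    "--properties=${JOB_PROPERTIES}" \
    -- 1000
}

submit_cmd
```

The same caveat applies here: with the driver running on the cluster, its output will not be streamed back to your terminal.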
Upvotes: 2