Reputation: 1492
I am trying to run a Spark job on a Google Dataproc cluster with:
gcloud dataproc jobs submit hadoop --cluster <cluster-name> \
--jar file:///usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar \
--class org.apache.hadoop.examples.WordCount \
--arg1 \
--arg2
But the job fails with this error:
(gcloud.dataproc.jobs.submit.spark) PERMISSION_DENIED: Request had insufficient authentication scopes.
How do I add the auth scopes needed to run the job?
Upvotes: 9
Views: 8659
Reputation: 21
You need to enable the option that allows API access when you create the Dataproc cluster. Only then can you submit jobs to the cluster with the gcloud dataproc jobs submit command.
Upvotes: 0
Reputation: 10677
You usually hit this error when running gcloud from inside a GCE VM whose scopes are controlled by VM metadata. A gcloud installation on a local machine typically already uses broad scopes that cover all GCP operations.
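To confirm which scopes the VM actually has, one option is to ask the metadata server from inside the VM. This is a minimal sketch, assuming the VM uses its default service account; the function name show_vm_scopes is only illustrative and is not from the original answer.

```shell
# Print the OAuth scopes granted to the VM's default service account.
# Only works from inside a GCE VM, where metadata.google.internal resolves.
show_vm_scopes() {
  curl -s -H "Metadata-Flavor: Google" \
    "http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/scopes"
}
```

Call show_vm_scopes from a shell on the VM. If https://www.googleapis.com/auth/cloud-platform is not in the list, that is a likely reason for the PERMISSION_DENIED error.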
For Dataproc access, specify --scopes cloud-platform on the CLI when creating the VM from which you run gcloud. If you create the VM from the Cloud Console UI, select "Allow full access to all Cloud APIs".
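As a rough sketch of the CLI route, creating the client VM could look like the following. The instance name, zone, and function name are hypothetical placeholders; only the --scopes cloud-platform flag comes from the answer itself.

```shell
# Hypothetical instance name and zone; replace with your own.
VM_NAME="gcloud-client-vm"
ZONE="us-central1-a"

# Create a VM whose service account gets the broad cloud-platform scope,
# so that gcloud run from that VM can call the Dataproc API.
create_client_vm() {
  gcloud compute instances create "$VM_NAME" \
    --zone "$ZONE" \
    --scopes cloud-platform
}
```

Run create_client_vm from a machine where gcloud is already authenticated, then SSH into the new VM and retry the job submission.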
As another commenter mentioned, you can now also update the scopes on an existing GCE instance to add the CLOUD_PLATFORM scope.
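For an existing instance, a possible sequence uses gcloud compute instances set-service-account. This is a sketch with hypothetical instance name, zone, and function name; it assumes the instance can tolerate a restart, since the instance has to be stopped before its scopes can be changed.

```shell
# Hypothetical instance name and zone; replace with your own.
VM_NAME="gcloud-client-vm"
ZONE="us-central1-a"

# Stop the instance, switch it to the cloud-platform scope, and start it again.
# Each step runs only if the previous one succeeded.
update_vm_scopes() {
  gcloud compute instances stop "$VM_NAME" --zone "$ZONE" &&
  gcloud compute instances set-service-account "$VM_NAME" \
    --zone "$ZONE" \
    --scopes cloud-platform &&
  gcloud compute instances start "$VM_NAME" --zone "$ZONE"
}
```

Run update_vm_scopes from a different machine (or Cloud Shell) rather than from the VM being modified, because stopping the VM ends any session on it.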
Upvotes: 21