Reputation: 311
I'm running jobs on a Dataproc cluster that access several GCP resources, such as Google Cloud Storage.
My concern is that every file or object created by my job is owned by the Dataproc default user (for example, [email protected]).
Is there any way to configure this so that objects are created by a user or service account I specify instead of the default one?
Upvotes: 2
Views: 2436
Reputation: 4721
You can set the service account a Dataproc cluster uses by passing the --service-account flag at cluster creation time.
The gcloud command would look like:
gcloud dataproc clusters create cluster-name \
    --service-account=your-service-account@project-id.iam.gserviceaccount.com
More details:
https://cloud.google.com/dataproc/docs/concepts/configuring-clusters/service-accounts
https://cloud.google.com/dataproc/docs/concepts/iam/iam
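For a fuller picture, here is a rough sketch of the end-to-end flow: create a dedicated service account, grant it the Dataproc Worker role, then create the cluster with it. The project ID, account name, cluster name, and region are all placeholders I made up, and the `run` helper is just a hypothetical dry-run wrapper that prints each command unless you set RUN=1. The account will also need access to whichever buckets your job writes to, so treat this as a starting point rather than a complete setup.

```shell
#!/usr/bin/env bash
# Placeholder values (assumptions); replace with your own.
PROJECT_ID="my-project"
SA_NAME="dataproc-jobs"
SA_EMAIL="${SA_NAME}@${PROJECT_ID}.iam.gserviceaccount.com"
CLUSTER="cluster-name"
REGION="us-central1"

# Hypothetical helper: prints the command by default, executes it only when RUN=1.
run() {
  if [ "${RUN:-0}" = "1" ]; then
    "$@"
  else
    printf '%s\n' "$*"
  fi
}

# 1. Create the custom service account.
run gcloud iam service-accounts create "$SA_NAME" --project="$PROJECT_ID"

# 2. Grant it the Dataproc Worker role so cluster VMs can run as this account.
run gcloud projects add-iam-policy-binding "$PROJECT_ID" \
  --member="serviceAccount:${SA_EMAIL}" \
  --role="roles/dataproc.worker"

# 3. Create the cluster with that account; jobs on it then access GCS as this identity.
run gcloud dataproc clusters create "$CLUSTER" \
  --project="$PROJECT_ID" \
  --region="$REGION" \
  --service-account="$SA_EMAIL"
```

Run it as-is to review the three commands, then rerun with RUN=1 once the values look right.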
Note: it is better to have one Dataproc cluster per job, so that each job gets an isolated environment, jobs don't affect each other, and you can manage them more easily (in terms of security as well).
You can also look at Cloud Composer, which lets you schedule and automate jobs.
Hope this helps.
Upvotes: 3