Umesh Gaikwad

Reputation: 311

How to submit job on Dataproc cluster with specific service account?

I'm trying to run jobs on a Dataproc cluster that access several GCP resources, such as Google Cloud Storage.

My concern is that any file or object created by my job is owned/created by the default Dataproc user.

Example - [email protected].

Is there any way to configure this user/service account so that objects are created by a given user/service account instead of the default one?

Upvotes: 2

Views: 2436

Answers (1)

Pradeep Bhadani

Reputation: 4721

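You can configure the service account used by a Dataproc cluster with the --service-account flag at cluster creation time. Jobs submitted to that cluster then access GCP resources such as Cloud Storage as that service account.

The gcloud command would look like: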

gcloud dataproc clusters create cluster-name \
  --service-account=your-service-account@project-id.iam.gserviceaccount.com

More details:
https://cloud.google.com/dataproc/docs/concepts/configuring-clusters/service-accounts
https://cloud.google.com/dataproc/docs/concepts/iam/iam
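
If the service account does not exist yet, the whole setup could look roughly like the sketch below. The values my-project, dataproc-jobs, my-cluster and us-central1 are placeholders, and the run helper with its DRY_RUN switch is a hypothetical convenience so the commands are only printed by default. It assumes roles/dataproc.worker is sufficient for the cluster VMs; the account also needs access to whatever buckets the job reads or writes.

```shell
#!/usr/bin/env bash
# Sketch: create a service account, grant it the Dataproc Worker role,
# and create a cluster whose VMs (and therefore jobs) run as that account.
# All names below are placeholders, not values from the question.
set -euo pipefail

PROJECT_ID="${PROJECT_ID:-my-project}"
SA_NAME="${SA_NAME:-dataproc-jobs}"
CLUSTER="${CLUSTER:-my-cluster}"
REGION="${REGION:-us-central1}"
SA_EMAIL="${SA_NAME}@${PROJECT_ID}.iam.gserviceaccount.com"

# Hypothetical helper: with DRY_RUN=1 (the default) only print the command.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

provision() {
  # 1. Create the service account.
  run gcloud iam service-accounts create "$SA_NAME" --project="$PROJECT_ID"
  # 2. Let it act as a Dataproc worker in the project.
  run gcloud projects add-iam-policy-binding "$PROJECT_ID" \
    --member="serviceAccount:${SA_EMAIL}" --role="roles/dataproc.worker"
  # 3. Create the cluster with that service account.
  run gcloud dataproc clusters create "$CLUSTER" --project="$PROJECT_ID" \
    --region="$REGION" --service-account="$SA_EMAIL"
}

provision
```

Run it with DRY_RUN=0 to actually execute the commands. The user creating the cluster typically also needs permission to act as the service account (for example the Service Account User role on it).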

Note: It is better to have one Dataproc cluster per job, so that each job gets an isolated environment, jobs don't affect each other, and you can manage them better (in terms of security as well).

You can also look at Cloud Composer, which lets you schedule and automate jobs.
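
A minimal sketch of the one-cluster-per-job pattern: create a short-lived cluster with the desired service account, submit the job, then delete the cluster. The project, region, service account email and the gs://my-bucket/jobs/job.py path are placeholders, and the run helper with DRY_RUN is a hypothetical convenience that only prints the commands by default. A PySpark job is assumed; other job types use a different subcommand of gcloud dataproc jobs submit.

```shell
#!/usr/bin/env bash
# Sketch: one ephemeral Dataproc cluster per job, running as a given service account.
# All values below are placeholders.
set -euo pipefail

REGION="${REGION:-us-central1}"
SA_EMAIL="${SA_EMAIL:-dataproc-jobs@my-project.iam.gserviceaccount.com}"
JOB_FILE="${JOB_FILE:-gs://my-bucket/jobs/job.py}"
CLUSTER="${CLUSTER:-job-$(date +%s)}"   # unique name per run

# Hypothetical helper: with DRY_RUN=1 (the default) only print the command.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

run_isolated_job() {
  local status=0
  run gcloud dataproc clusters create "$CLUSTER" \
    --region="$REGION" --service-account="$SA_EMAIL"
  # Remember a job failure, but still fall through to the cleanup step.
  run gcloud dataproc jobs submit pyspark "$JOB_FILE" \
    --cluster="$CLUSTER" --region="$REGION" || status=$?
  run gcloud dataproc clusters delete "$CLUSTER" --region="$REGION" --quiet
  return "$status"
}

run_isolated_job
```

Because each job gets its own cluster, each job can also get its own service account with only the permissions that job needs.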

Hope this helps.

Upvotes: 3
