howie

Reputation: 2695

How to create a Dataproc cluster using a service account

I am quite confused by this documentation:

Service account requirements and Limitations:
* Service accounts can only be set when a cluster is created.
* You need to create a service account before creating the Cloud Dataproc cluster that will be associated with the service account.
* Once set, the service account used for a cluster cannot be changed.

Does this mean I cannot create a service account that has a role allowing it to create a Dataproc cluster? For now, I can only create a Dataproc cluster with my own account ("gcloud auth login"), but I want to create Dataproc clusters from Jenkins by setting up:

gcloud auth activate-service-account --key-file

Upvotes: 0

Views: 2864

Answers (2)

Md Shihab Uddin

Reputation: 561

First, create a service account and grant it the following roles:

  1. Dataproc Worker: According to the [doc][1]:

     To create a cluster with a user-specified service account, the specified service account must have all permissions granted by the Dataproc Worker role.

  2. Dataproc Hub Agent: This grants permission to act as the service account. Without it, cluster creation fails with the following error:

     ERROR: (gcloud.beta.dataproc.clusters.create) INVALID_ARGUMENT: User not authorized to act as service account '[email protected]'. To act as a service account, user must have one of [Owner, Editor, Service Account Actor] roles. See https://cloud.google.com/iam/docs/understanding-service-accounts for additional details.

  3. Dataproc Editor: This role grants permission to create and delete Dataproc clusters.
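The three roles listed above could be granted from the command line roughly as follows. This is only a sketch: the project ID and service account email are hypothetical placeholders, and the role IDs are what I assume correspond to the role names above. The `run` helper is also just for illustration; it prints each command unless you set `DRY_RUN=0`, so you can review what would be executed first.

```shell
#!/bin/sh
# Hypothetical placeholders; replace with your own project and service account.
PROJECT_ID="${PROJECT_ID:-my-project}"
SA_EMAIL="${SA_EMAIL:-jenkins-dataproc@${PROJECT_ID}.iam.gserviceaccount.com}"

# Role IDs assumed to match "Dataproc Worker", "Dataproc Hub Agent"
# and "Dataproc Editor" from the list above.
ROLES="roles/dataproc.worker roles/dataproc.hubAgent roles/dataproc.editor"

# Illustrative helper: print the command by default, execute it when DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

# Bind each role to the service account at the project level.
grant_roles() {
  for role in $ROLES; do
    run gcloud projects add-iam-policy-binding "$PROJECT_ID" \
      --member="serviceAccount:${SA_EMAIL}" \
      --role="$role"
  done
}

grant_roles
```

These bindings have to be made by an account that is allowed to change the project's IAM policy (for example your own user account), not by the new service account itself.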

Activate service account: After granting the roles, download the service account's JSON key file. Activate the new service account with gcloud auth activate-service-account --key-file=<service-json>. Check the activation with gcloud auth list. Set the GOOGLE_APPLICATION_CREDENTIALS environment variable with export GOOGLE_APPLICATION_CREDENTIALS="service-json-full-path".

Now everything should be ready to create a Dataproc cluster using the service account. Here are sample commands to do so:

gcloud auth activate-service-account --key-file=<service-key-file>
export GOOGLE_APPLICATION_CREDENTIALS="<service-key-file>"
gcloud beta dataproc clusters create <CLUSTER-NAME> \
    --region=<REGION> \
    --project=<PROJECT-ID> \
    --service-account=<SERVICE-ACCOUNT-EMAIL> \
    --single-node

Upvotes: 0

tix

Reputation: 2158

Yes, you can use a service account to create Dataproc clusters and submit jobs. However, the link you refer to deals with the service account that a Dataproc cluster itself runs as, which isn't applicable to your concern.

To create a Dataproc cluster using a service account:

  1. Create a service account

  2. Assign it the Cloud Dataproc Editor role

  3. Download its JSON credentials file

  4. Configure authentication mechanism:

    4.1 gcloud auth activate-service-account --key-file=JSON_FILE_PATH

    4.2 export GOOGLE_APPLICATION_CREDENTIALS=JSON_FILE_PATH

  5. Create your Dataproc cluster
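The steps above can be sketched end to end as a shell script, for example as the body of a Jenkins shell step. Treat it as a rough outline under assumptions: the project ID, region, cluster name, service account name and key file path are all hypothetical placeholders, and `roles/dataproc.editor` is the role ID I assume matches "Cloud Dataproc Editor". The `run` helper is illustrative only and prints the commands unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Hypothetical placeholders; replace with your own values.
PROJECT_ID="${PROJECT_ID:-my-project}"
REGION="${REGION:-us-central1}"
CLUSTER_NAME="${CLUSTER_NAME:-jenkins-cluster}"
SA_NAME="${SA_NAME:-jenkins-dataproc}"
SA_EMAIL="${SA_NAME}@${PROJECT_ID}.iam.gserviceaccount.com"
KEY_FILE="${KEY_FILE:-./jenkins-dataproc-key.json}"

# Illustrative helper: print the command by default, execute it when DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

create_cluster_as_service_account() {
  # Steps 1-3 are one-time setup, normally done with your own (admin) account.
  # 1. Create the service account.
  run gcloud iam service-accounts create "$SA_NAME" --project="$PROJECT_ID"
  # 2. Assign the Dataproc Editor role (role ID is an assumption).
  run gcloud projects add-iam-policy-binding "$PROJECT_ID" \
    --member="serviceAccount:${SA_EMAIL}" \
    --role="roles/dataproc.editor"
  # 3. Download a JSON key for the service account.
  run gcloud iam service-accounts keys create "$KEY_FILE" \
    --iam-account="$SA_EMAIL"

  # Steps 4-5 are what the Jenkins job would run on each build.
  # 4.1 Authenticate gcloud as the service account.
  run gcloud auth activate-service-account --key-file="$KEY_FILE"
  # 4.2 Point client libraries at the same key.
  export GOOGLE_APPLICATION_CREDENTIALS="$KEY_FILE"
  # 5. Create the cluster (single-node here just to keep the example small).
  run gcloud dataproc clusters create "$CLUSTER_NAME" \
    --region="$REGION" \
    --project="$PROJECT_ID" \
    --single-node
}

create_cluster_as_service_account
```

In practice you would store the key file as a Jenkins credential rather than generating it in the job. Depending on the project setup, the service account may also need permission to act as the cluster's VM service account, as the error message quoted in the other answer suggests.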

Upvotes: 1
