srig

Reputation: 33

How to provide credentials in Apache Beam Python programmatically?

We are using Apache Beam through Airflow. The default GCS account is set with the environment variable GOOGLE_APPLICATION_CREDENTIALS. We don't want to change that environment variable, as it might affect other processes running at the same time. I couldn't find a way to change the Google Cloud Dataflow service account programmatically. We create the pipeline as follows: p = beam.Pipeline(argv=self.conf)

Is there an option, through argv or the pipeline options, where I can specify the location of the GCS credentials file? I searched the documentation but didn't find much information.

Upvotes: 3

Views: 2094

Answers (1)

FridayPush

Reputation: 916

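You can specify a service account when you launch the job with a basic flag. In the Python SDK the flag is --service_account_email=my-service-account-name@my-project.iam.gserviceaccount.com (the camel-case form, --serviceAccount=my-service-account-name@my-project.iam.gserviceaccount.com, is the Java SDK spelling).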

That account will need the Dataflow Worker role attached, plus whatever other roles the job requires (GCS, BigQuery, etc.). Details here. You don't need the service account's key file to be stored in GCS or kept locally in order to use it.
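A minimal sketch of adding the flag to the argv list from the question, assuming the Python SDK option name --service_account_email. The helper with_service_account and the runner and project values are hypothetical and only for illustration; the Beam call is left as a comment so the snippet runs without Beam installed.

```python
def with_service_account(argv, email):
    """Return a copy of argv with --service_account_email set to email.

    Hypothetical helper: it only edits the argument list and does not
    touch GOOGLE_APPLICATION_CREDENTIALS or any other environment state.
    """
    flag = "--service_account_email"
    cleaned = []
    skip_next = False
    for arg in argv:
        if skip_next:
            # This is the value of a previous "--flag value" pair; drop it.
            skip_next = False
            continue
        if arg == flag:
            skip_next = True
            continue
        if arg.startswith(flag + "="):
            continue
        cleaned.append(arg)
    return cleaned + [f"{flag}={email}"]


# Placeholder options standing in for self.conf from the question.
conf = ["--runner=DataflowRunner", "--project=my-project"]
argv = with_service_account(
    conf, "my-service-account-name@my-project.iam.gserviceaccount.com"
)

# With Beam installed, the result can be passed to the pipeline as before:
#   import apache_beam as beam
#   p = beam.Pipeline(argv=argv)
```

Because the helper returns a new list, the original conf stays untouched, so other Airflow tasks that reuse it are not affected.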

Upvotes: 5
