Reputation: 33
We are using Apache Beam through Airflow. The default Google Cloud service account is set with the environment variable GOOGLE_APPLICATION_CREDENTIALS. We don't want to change the environment variable, as it might affect other processes running at that time. I couldn't find a way to change the Google Cloud Dataflow service account programmatically. We create the pipeline like this: p = beam.Pipeline(argv=self.conf)
Is there any option, through argv or the pipeline options, where I can specify the location of the GCS credential file? I searched through the documentation but didn't find much information.
Upvotes: 3
Views: 2094
Reputation: 916
You can specify a service account when you launch the job with a basic flag (this is the Java SDK spelling; the Python SDK equivalent is --service_account_email):
--serviceAccount=my-service-account-name@my-project.iam.gserviceaccount.com
That account will need the Dataflow Worker role attached, plus whatever else it needs (GCS, BigQuery, etc.). Details are in the Dataflow security and permissions documentation. You don't need the service account to be stored in GCS, or its keys stored locally, to use it.
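Since the question builds the pipeline from Python with beam.Pipeline(argv=self.conf), here is a minimal sketch of adding the flag to that argument list. It assumes self.conf is a plain list of strings; the helper name with_service_account and the project, bucket, and account values are hypothetical placeholders. Note that this flag sets the account the Dataflow workers run as. Job submission itself still uses whatever credentials GOOGLE_APPLICATION_CREDENTIALS points to, and those credentials need permission to act as the chosen service account.

```python
from typing import List, Sequence

# Python SDK spelling of the worker service account option.
FLAG = "--service_account_email"


def with_service_account(argv: Sequence[str], email: str) -> List[str]:
    """Return a copy of argv with the Dataflow worker service account set.

    Any existing --service_account_email entry, in either "--flag=value"
    or "--flag value" form, is dropped so the new value is the only one.
    (Hypothetical helper, not part of Apache Beam.)
    """
    cleaned: List[str] = []
    skip_next = False
    for arg in argv:
        if skip_next:
            # This is the value that followed a bare "--service_account_email".
            skip_next = False
            continue
        if arg == FLAG:
            skip_next = True
            continue
        if arg.startswith(FLAG + "="):
            continue
        cleaned.append(arg)
    cleaned.append(f"{FLAG}={email}")
    return cleaned


# Placeholder options standing in for self.conf from the question.
conf = [
    "--runner=DataflowRunner",
    "--project=my-project",
    "--temp_location=gs://my-bucket/tmp",
]
argv = with_service_account(
    conf, "my-service-account-name@my-project.iam.gserviceaccount.com"
)
# The result can then be passed on unchanged:
#   p = beam.Pipeline(argv=argv)
```

Building a new list rather than mutating self.conf keeps the change local to this pipeline, so other Airflow tasks sharing the same configuration and environment variable are not affected.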
Upvotes: 5