Pruthviraj Shivanna

Reputation: 175

Is it possible to use service accounts to schedule queries with BigQuery's "Scheduled Query" feature?

We are using the Beta Scheduled query feature of BigQuery. Details: https://cloud.google.com/bigquery/docs/scheduling-queries

We have a few ETL scheduled queries running overnight to optimize aggregation and reduce query cost. They work well and there haven't been many issues.

The problem arises when the person who scheduled the query using their own credentials leaves the organization. I know we can do "update credential" in such cases.

I read through the documentation and also gave it a try, but couldn't determine whether we can use a service account instead of individual accounts to schedule queries.

Service accounts are cleaner, tie into the rest of the IAM framework, and are not dependent on a single user.

So if you have any additional information regarding scheduled queries and service accounts, please share.

Thanks for taking time to read the question and respond to it.

Regards

Upvotes: 4

Views: 7842

Answers (4)

VIKRANT KUMAR SINGH

Reputation: 41

This question is quite old, and I came across this thread while searching for the same thing. Yes, it is possible to use a service account to schedule BigQuery jobs.

While creating the scheduled query job, click on "Advanced options"; you will get the option to select a service account.

By default it uses the credentials of the requesting user. [Image: BigQuery "create scheduled query" dialog]

Upvotes: 3

Deen酱

Reputation: 146

BigQuery Scheduled Query now supports both creating a scheduled query with a service account and updating a scheduled query to use a service account. Will these work for you?
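From the command line, the same thing can be done with the `bq` CLI's `--service_account_name` flag. A minimal sketch (project, dataset, and service account names are placeholders of my own):

```shell
# Create a scheduled query that runs as a service account
bq mk \
  --transfer_config \
  --project_id=my-test-proj \
  --target_dataset=my_data_set \
  --display_name='scheduled_query_test' \
  --data_source=scheduled_query \
  --params='{"query":"SELECT CURRENT_DATE() AS date, RAND() AS val"}' \
  --service_account_name=my-sa@my-test-proj.iam.gserviceaccount.com

# Switch an existing scheduled query's credentials to a service account
bq update \
  --transfer_config \
  --update_credentials \
  --service_account_name=my-sa@my-test-proj.iam.gserviceaccount.com \
  projects/my-test-proj/locations/us/transferConfigs/CONFIG_ID
```

The `bq update --transfer_config --update_credentials` form is also what addresses the original question's "person leaves the organization" scenario without re-creating the query.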

Upvotes: 6

CM Ma

Reputation: 31

While it's not supported in the BigQuery UI, it's possible to create a transfer (including a scheduled query) using the Python GCP SDK for DTS, or from the BQ CLI.

The following is an example using Python SDK:

r"""Example of creating TransferConfig using service account.

Usage Example:
1. Install GCP BQ python client library.
2. If it has not been done, please grant the P4 service account the
   iam.serviceAccounts.getAccessToken permission on your project:
  $ gcloud projects add-iam-policy-binding {user_project_id} \
   --member='serviceAccount:service-{user_project_number}@'\
   'gcp-sa-bigquerydatatransfer.iam.gserviceaccount.com' \
   --role='roles/iam.serviceAccountTokenCreator'

   where {user_project_id} and {user_project_number} are the user project's
   project id and project number, respectively. E.g.,
  $ gcloud projects add-iam-policy-binding my-test-proj \
  --member='serviceAccount:service-123456789@'\
  'gcp-sa-bigquerydatatransfer.iam.gserviceaccount.com'\
  --role='roles/iam.serviceAccountTokenCreator'

3. Set environment var PROJECT_ID to your user project, and
   GOOGLE_APPLICATION_CREDENTIALS to the service account key path. E.g.,
   $ export PROJECT_ID='my_project_id'
   $ export GOOGLE_APPLICATION_CREDENTIALS='./serviceacct-creds.json'
4. $ python3 ./create_transfer_config.py
"""

import os
from google.cloud import bigquery_datatransfer
from google.oauth2 import service_account
from google.protobuf.struct_pb2 import Struct

PROJECT = os.environ["PROJECT_ID"]
SA_KEY_PATH = os.environ["GOOGLE_APPLICATION_CREDENTIALS"]

credentials = (
    service_account.Credentials.from_service_account_file(SA_KEY_PATH))

client = bigquery_datatransfer.DataTransferServiceClient(
    credentials=credentials)
# Get the full resource path of the project (older client versions;
# newer versions expect you to build "projects/{PROJECT}" yourself)
parent_base = client.project_path(PROJECT)

params = Struct()
params["query"] = "SELECT CURRENT_DATE() as date, RAND() as val"
transfer_config = {
    "destination_dataset_id": "my_data_set",
    "display_name": "scheduled_query_test",
    "data_source_id": "scheduled_query",
    "params": params,
}

parent = parent_base + "/locations/us"

response = client.create_transfer_config(parent, transfer_config)
print(response)
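For reference, the DTS resource paths that `project_path` assembles (and that newer client versions expect you to build by hand) follow a fixed pattern; a minimal sketch, with helper names of my own:

```python
def dts_parent(project_id: str, location: str = "us") -> str:
    """Parent resource path used when creating transfer configs in a location."""
    return f"projects/{project_id}/locations/{location}"


def dts_transfer_config(project_id: str, config_id: str,
                        location: str = "us") -> str:
    """Full resource name of an existing transfer config (scheduled query)."""
    return f"{dts_parent(project_id, location)}/transferConfigs/{config_id}"


print(dts_parent("my-test-proj"))
# projects/my-test-proj/locations/us
```

The second form is what `bq update --transfer_config` and the DTS `update_transfer_config`/`delete_transfer_config` calls take as their target.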

Upvotes: 3

Graham Polley

Reputation: 14791

As far as I know, unfortunately you can't yet use a service account to schedule queries directly. Maybe a Googler will correct me, but the BigQuery docs implicitly state this:

https://cloud.google.com/bigquery/docs/scheduling-queries#quotas

A scheduled query is executed with the creator's credentials and project, as if you were executing the query yourself

If you need to use a service account (which is great practice BTW), then there are a few workarounds listed here. I've raised a FR here for posterity.

Upvotes: 2
