Prashant Jadhav

Reputation: 31

Google OAuth 2.0 using Python for GCP BigQuery

I am looking for a code snippet implementing OAuth 2.0 authentication in Python to connect to the GCP BigQuery service.

I am using Google Cloud Shell to write the Python code, but when I fetch the access token I receive a bad request error.

access_token = google.fetch_token(token_url=token_url, client_id=client_id, client_secret=client_secret, authorization_response=redirect_response)

Also, I need to automate this process, so manually pasting the redirect_response needs to be avoided.

Upvotes: 2

Views: 5330

Answers (3)

Kristian Bengtsson

Reputation: 19

You will need credentials for a service account exported to a JSON file. In the GCP console go to IAM & Admin -> Service Accounts, and under the three little dots next to your account you will find "Create key".
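If you prefer the command line, the same key can be created with gcloud (a sketch; the service account name and project are placeholders to adjust):

gcloud iam service-accounts keys create Credentials.json --iam-account [NAME]@[PROJECT_ID].iam.gserviceaccount.com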

As mentioned in the other answers, you will also need the BigQuery client library.
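If it is not installed yet, it can be added with pip (assuming a standard Python environment):

pip install google-cloud-bigquery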

Then something like this should work:

from google.cloud import bigquery
from google.oauth2 import service_account

def BigQuery():
  try:
    credentials = service_account.Credentials.from_service_account_file(
      '/Credentials.json')
    project_id = '[project_id]'
    client = bigquery.Client(credentials=credentials, project=project_id)

    query = ('SELECT Column1, Column2 FROM `{}.{}.{}` limit 20'.format('[project_id]','[dataset]','[table]'))
    query_job = client.query(query)
    results = query_job.result()
    for row in results:
      print('Column 1: {}, Column 2: {}'.format(row.Column1, row.Column2))
  except Exception as error:
    # Surface the actual error instead of a generic message
    print('Error: {}'.format(error))



if __name__ == '__main__':
  BigQuery()

Upvotes: 1

juferafo

Reputation: 556

The BigQuery Client Libraries documentation explains how to set up the authentication both from the GCP console and from the command line.

To use the BigQuery API library you need to authenticate your service account. The following gcloud command generates a JSON key file with the necessary private information (your project_id, private key, etc.) to do so:

gcloud iam service-accounts keys create [FILE_NAME].json --iam-account [NAME]@[PROJECT_ID].iam.gserviceaccount.com

When making BigQuery API calls, you need to provide these credentials to your application code. This can be done by setting the environment variable GOOGLE_APPLICATION_CREDENTIALS to the path of the service account JSON file:

export GOOGLE_APPLICATION_CREDENTIALS="PATH/TO/SERVICE_ACCOUNT.json"

However, this will only work during your current shell session, so if the session expires or you open a new one you will need to set this variable again. Another way to authenticate is to use the method service_account.Credentials.from_service_account_file (from google.oauth2) inside your Python script.

In the following Python code the service account is authenticated with service_account.Credentials.from_service_account_file, a new BigQuery table is generated from a CSV file located in Google Cloud Storage, and new data is inserted into that table.

from google.cloud import bigquery
from google.oauth2 import service_account

# Path to the service account credentials
key_path = "/PATH/TO/SERVICE-ACCOUNT.json"
credentials = service_account.Credentials.from_service_account_file(
    key_path,
    scopes=["https://www.googleapis.com/auth/cloud-platform"],
)

# Instantiation of the BigQuery client with the service account credentials
bigquery_client = bigquery.Client(credentials=credentials,
                                  project=credentials.project_id)

GCS_URI    = "gs://MY_BUCKET/MY_CSV_FILE"
DATASET_ID = "MY_DATASET"
TABLE_ID   = "MY_TABLE"

def bq_insert_from_gcs(target_uri = GCS_URI, dataset_id = DATASET_ID, table_id = TABLE_ID):
    """This method inserts a CSV file stored in GCS into a BigQuery Table."""

    dataset_ref = bigquery_client.dataset(dataset_id)

    job_config = bigquery.LoadJobConfig()
    # Schema autodetection enabled
    job_config.autodetect = True
    # Skipping the first row, which corresponds to the field names
    job_config.skip_leading_rows = 1
    # Format of the data in GCS
    job_config.source_format = bigquery.SourceFormat.CSV
    load_job = bigquery_client.load_table_from_uri(target_uri,
                                                   dataset_ref.table(table_id),
                                                   job_config=job_config)

    print('Starting job {}'.format(load_job.job_id))
    print('Loading file {} into the BigQuery table {}'.format(target_uri, table_id))

    load_job.result()
    return 'Job finished.\n'


def bq_insert_to_table(rows_to_insert, dataset_id = DATASET_ID, table_id= TABLE_ID):
    """This method inserts rows into a BigQuery table"""

    # Prepares a reference to the dataset and table
    dataset_ref = bigquery_client.dataset(dataset_id)
    table_ref = dataset_ref.table(table_id)
    # API request to get table call
    table = bigquery_client.get_table(table_ref)

    # API request to insert the rows_to_insert
    print("Inserting rows into BigQuery table {}".format(table_id))
    errors = bigquery_client.insert_rows(table, rows_to_insert)
    assert errors == []


bq_insert_from_gcs()

rows_to_insert = [(u'Alice', u'cat'),
                  (u'John', u'dog')]
bq_insert_to_table(rows_to_insert)

Also, I would strongly recommend implementing your script in Python 3, since Python 2 will no longer be supported by google-cloud-bigquery from 01/01/2020.

Upvotes: 0

Kunal Deo

Reputation: 2308

It is recommended that you use the BigQuery Python client library, which is provided by the pip package google-cloud-bigquery. You will also need to set up GOOGLE_APPLICATION_CREDENTIALS with the service account JSON file.

Using this approach you won't need to deal with token generation and renewal, as this is taken care of by the client libraries in the background.
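For illustration, a minimal sketch of what that looks like, assuming GOOGLE_APPLICATION_CREDENTIALS is already exported and using a placeholder query:

from google.cloud import bigquery

# The client picks up the credentials from GOOGLE_APPLICATION_CREDENTIALS
# automatically; there is no manual token handling.
client = bigquery.Client()

# Placeholder query just to show the call pattern; replace with your own SQL.
query_job = client.query('SELECT 1 AS one')
for row in query_job.result():
    print(row.one)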

Please see BigQuery Client Libraries Python section for detailed instructions.

Upvotes: 0
