user912830823

Reputation: 1352

Loading table from Cloud Storage to BigQuery using Python

Could someone share an example of a job config for uploading a newline-delimited JSON file to a new BigQuery table, please?

I've been trying to do this based on the Google docs, with no success so far.

Upvotes: 2

Views: 1211

Answers (1)

Willian Fuks

Reputation: 11797

This example from the GCP repository is a good one for loading data from GCS.

The only thing you will have to adapt is setting job.source_format to newline-delimited JSON, like so:

import uuid

from google.cloud import bigquery


def load_data_from_gcs(dataset_name, table_name, source):
    bigquery_client = bigquery.Client()
    dataset = bigquery_client.dataset(dataset_name)
    table = dataset.table(table_name)
    job_name = str(uuid.uuid4())

    job = bigquery_client.load_table_from_storage(
        job_name, table, source)

    # Tell BigQuery the source file is newline-delimited JSON
    # (the default source format is CSV).
    job.source_format = 'NEWLINE_DELIMITED_JSON'
    job.begin()

    wait_for_job(job)

    print('Loaded {} rows into {}:{}.'.format(
        job.output_rows, dataset_name, table_name))

(Ideally you would receive the source format as an input parameter to the function, but this works as an example.)
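The sample also depends on a wait_for_job helper from the same repository; a minimal version, assuming the older google-cloud-bigquery job API (reload(), state, error_result), looks like this:

```python
import time


def wait_for_job(job, poll_interval=1):
    """Block until a BigQuery job finishes (older client API)."""
    while True:
        job.reload()  # refresh the job's state from the API
        if job.state == 'DONE':
            if job.error_result:
                # Surface the job's errors if the load failed.
                raise RuntimeError(job.errors)
            return
        time.sleep(poll_interval)
```

It simply polls the job until BigQuery reports it as done, raising if the load failed.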

Also, the table should already exist when you run this code (I looked for schema auto-detection in the Python API but it seems there isn't one yet).
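Since the table has to exist up front, its schema must match the keys in your JSON. As a quick illustration (hypothetical field names), a newline-delimited JSON source is just one complete JSON object per line, each sharing the target table's schema:

```python
import json

# Hypothetical newline-delimited JSON content; the target table here
# would need a STRING column "name" and an INTEGER column "age".
ndjson = '\n'.join([
    '{"name": "Alice", "age": 30}',
    '{"name": "Bob", "age": 25}',
])

# Each line must parse as a standalone JSON object.
rows = [json.loads(line) for line in ndjson.splitlines()]
print(rows[0]['name'])  # -> Alice
```

If any line is not valid JSON on its own, the load job will fail for that row.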

Upvotes: 2
