Rich Murnane
Rich Murnane

Reputation: 2920

Python 2.7 & GCP Google BigQuery: CREATE TABLE AS SELECT.... (CTAS)

I'm using python 2.7 (can't change right now), and Google python client library v0.28 of google.cloud.bigquery, and I'm trying to figure out how I'd do a "create table XX as select a,b,c from Y where n='helloworld'".

We call this "CREATE TABLE AS SELECT" (CTAS) in other databases, though I'm not sure what the best approach is to do this in bq via python.

This is an interesting article which talks about using a job, but I think the bq python library only has a few types of jobs, so I'm not sure where to start with this one. (LoadJob, CopyJob, ExtractJob, and QueryJob) https://chartio.com/resources/tutorials/how-to-create-a-table-from-a-query-in-google-bigquery/

There's a bigquery_client.create_table function which is good, but I don't think I can set a query configuration field, or any real configuration fields in that.

So, any help you can provide or guidance would be appreciated. If you are here in the US, I hope you have a great Thanksgiving holiday.

Thank you very much and best regards to all my bigquery friends...Rich

BEGIN EDIT

This one can be closed, the help below and the following link solved what I need, I put my code in here for anyone else who might want it.

code hijacked from Create a table from query results in Google BigQuery

def create_table_as_select(dataset_name, table_name, sqlQuery, project=None):
    try:
    job_config = bigquery.QueryJobConfig()

    # Set configuration.query.destinationTable
    dataset_ref = bigquery_client.dataset(dataset_name)
    table_ref = dataset_ref.table(table_name)

    job_config.destination = table_ref

    # Set configuration.query.createDisposition
    job_config.create_disposition = 'CREATE_IF_NEEDED'

    # Set configuration.query.writeDisposition
    job_config.write_disposition = 'WRITE_APPEND'

    # Start the query
    job = bigquery_client.query(sqlQuery, job_config=job_config)

    # Wait for the query to finish
    job.result()

    returnMsg = 'Created table {} .'.format(table_name)

    return returnMsg

except Exception as e:
    errorStr = 'ERROR (create_table_as_select): ' + str(e)
    print(errorStr)
    raise

Upvotes: 0

Views: 630

Answers (1)

Mikhail Berlyant
Mikhail Berlyant

Reputation: 172954

guidance would be appreciated

DDL is not supported (hopefully yet) by BigQuery - so you should just use regular way: jobs.insert with respective destination table and write disposition

https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs#configuration.query

Upvotes: 2

Related Questions