Creating a Weekly Scheduled Query Job in Google BigQuery Using Python

Question

I have a function that is supposed to query a table and save a table corresponding to the date the query job is done.

def Query_Schedule(request):
    from google.cloud import bigquery
    from google.cloud.bigquery.job import QueryJobConfig
    from datetime import datetime
    today = datetime.today()
    client = bigquery.Client(project='project-id')
    query_string = """SELECT * FROM dataset.table"""
    dataset = client.dataset('dataset')
    table = dataset.table('table_{}_{}_{}'.format(today.month, today.day, today.year))
    cfg = QueryJobConfig()
    cfg.use_legacy_sql = False
    cfg.destination = table
    cfg.write_disposition = 'WRITE_EMPTY'
    cfg.timeout_ms = 60000
    job = client.query(query=query_string, job_id='weekly-table-creation', job_config=cfg)
    return job

After referencing this question and adapting it to the most recent documentation I am still getting an error,

Error: function terminated. Recommended action: inspect logs for termination reason. Details: 409 POST https://bigquery.googleapis.com/bigquery/v2/projects/project-id/jobs: Already Exists: Job project-id:US.weekly-table-creation

(job ID: weekly-table-creation)

-----Query Job SQL Follows-----
| . | . | . | 1:SELECT * FROM dataset.table | . | . | . |

after I test the function. Is there any solution to this problem?

shollyman · Accepted Answer

The issue is that you're specifying a static job_id for all invocations. Jobs in BQ have unique job IDs, and you can't re-insert a job with the same ID. Either don't specify the ID at all, or provide a job ID that's sufficiently unique (e.g. suffixed based on the scheduled time, etc).

More info: https://cloud.google.com/bigquery/docs/running-jobs#generate-jobid

Creating a Weekly Scheduled Query Job in Google BigQuery Using Python

Answers (1)

Related Questions