Reputation: 333
I have a function that is supposed to query a table and save a table corresponding to the date the query job is done.
def Query_Schedule(request):
from google.cloud import bigquery
from google.cloud.bigquery.job import QueryJobConfig
from datetime import datetime
today = datetime.today()
client = bigquery.Client(project='project-id')
query_string = """SELECT * FROM dataset.table"""
dataset = client.dataset('dataset')
table = dataset.table('table_{}_{}_{}'.format(today.month, today.day, today.year))
cfg = QueryJobConfig()
cfg.use_legacy_sql = False
cfg.destination = table
cfg.write_disposition = 'WRITE_EMPTY'
cfg.timeout_ms = 60000
job = client.query(query=query_string, job_id='weekly-table-creation', job_config=cfg)
return job
After referencing this question and adapting it to the most recent documentation I am still getting an error,
Error: function terminated. Recommended action: inspect logs for termination reason. Details: 409 POST https://bigquery.googleapis.com/bigquery/v2/projects/project-id/jobs: Already Exists: Job project-id:US.weekly-table-creation
(job ID: weekly-table-creation)
-----Query Job SQL Follows-----
| . | . | . |
1:SELECT * FROM dataset.table
| . | . | . |
after I test the function. Is there any solution to this problem?
Upvotes: 1
Views: 610
Reputation: 4384
The issue is that you're specifying a static job_id for all invocations. Jobs in BQ have unique job IDs, and you can't re-insert a job with the same ID. Either don't specify the ID at all, or provide a job ID that's sufficiently unique (e.g. suffixed based on the scheduled time, etc).
More info: https://cloud.google.com/bigquery/docs/running-jobs#generate-jobid
Upvotes: 3