Diego Navarro
Diego Navarro

Reputation: 9704

How can I set up scheduled queries in BigQuery with timezone support (through Python SDK)

In the BigQuery UI, I can schedule a query with a specific timezone (as you can see in the screenshot below).

With these settings, I'm able to schedule my query at the correct local time, but when I try to automate this process with Python I can't see any option to specify the timezone (https://cloud.google.com/bigquery/docs/scheduling-queries)

def create_scheduled_query(project_id, dataset_id, query_string, dest_table, write_disposition=WriteDisposition.WRITE_TRUNCATE):
    parent = client.project_path(project_id)

    transfer_config = google.protobuf.json_format.ParseDict(
        {
            "destination_dataset_id": dataset_id,
            "display_name": dest_table,
            "data_source_id": "scheduled_query",
            "params": {
                "query": query_string,
                "destination_table_name_template": dest_table,
                "write_disposition": write_disposition,
                "partitioning_field": "",
            },
            "schedule": "every day 06:00",  # How can add timezones here?
        },
        bigquery_datatransfer_v1.types.TransferConfig(),
    )

    response = client.create_transfer_config(
        parent, transfer_config, authorization_code=authorization_code
    )

    print("Created scheduled query '{}'".format(response.name))

Is there any way to accomplish this with the Python Client? Thanks!

Upvotes: 5

Views: 2531

Answers (2)

Alexandre Moraes
Alexandre Moraes

Reputation: 4032

After carefully reviewing the documentation, I concluded that selection the time zone within your Python script is not possible.

Within the APIs and Reference document, you can find out all the possible arguments for you scheduled query in transfer_config. The schedule parameter is described as :

Data transfer schedule. If the data source does not support a custom schedule, this should be empty. If it is empty, the default value for the data source will be used. The specified times are in UTC. Examples of valid format: 1st,3rd monday of month 15:30, every wed,fri of jan,jun 13:15, and first sunday of quarter 00:00.

Therefore, in the Console the time zone selected is used for display purpose only, since the time is converted to UTC Time in order to run your scheduled query.

In addition, just as a bonus information, I would like to point that in App Engine you can select the time zone for your scheduled job, using cron.yaml. You can see the timezone parameter here.

I hope it clarifies your doubt about it.

Upvotes: 2

Gareth P
Gareth P

Reputation: 106

Not that this is a perfect answer, but you should be passing in the time to execute and the time zone. Convert that to UTC before you put it in the transfer_config. (Or don't pass it in and just change it to UTC.

I completely agree that it should support TZs but it doesn't.... next!

Upvotes: 0

Related Questions