Reputation: 985
Right now I have scheduled queries via BQ interface. They work but do not scale or migrate very well (across dev and prod gcp projects). So I am trying to do scheduled queries in a way that is reproducible, scalable and migrate-able.
My queries are complicated and hence I am struggling with '
, "
and '''
to make it run via bq
commands and schedule via github actions.
This is the query which is most complicated:
declare bq_last_id int64;
declare external_sql string;
set bq_last_id = (select max(id) from bq_dataset.bq_table);
set external_sql = '"select * from mysql_table where id > ('|| bq_last_id ||')"';
execute immediate 'select * from external_query("my-gcp-project.my-region.my-connection-name",'|| external_sql || ');'
In total there are 20 something queries that have to scheduled. This one is the only one which is incremental or other are drop and recreate table again so they are not as complicated as this one.
WHAT I HAVE TRIED TILL NOW:
bq mk
command to run it with timestamp as a variable as shown in this answer. The problem with it is I still have to manually create the on-demand queries and I will have to do separately in dev and prod projects.bq
cmd.bq query
to run the queries (that is not create bq scheduled queries at all). and then later schedule them via gihub action.Any help with correct syntax or better suggestions to do this will be super helpful to me.
Thanks.
Upvotes: 1
Views: 1263
Reputation: 1945
The way I use to solve problems with scape characters when inserting queries in bq
commands is using jq
on shell and my queries on a file, as following:
queries.sql
file with your query script:cat queries.sql
declare bq_last_id int64;
declare external_sql string;
set bq_last_id = (select max(id) from bq_dataset.bq_table);
set external_sql = '"select * from mysql_table where id > ('|| bq_last_id ||')"';
execute immediate 'select * from external_query("my-gcp-project.my-region.my-connection-name",'|| external_sql || ');'
#!/bin/bash
set -f #avoind * used as wildcard
json=$(jq -nc --arg query "$(<queries.sql)" '{ "query": $query }')
#adapt the command and param to work in your environment (destination, tables, etc...)
bq mk \
--transfer_config \
--target_dataset=mydataset \
--display_name='My Scheduled Query' \
--params="$json" \
--data_source=scheduled_query \
--service_account_name=abcdef-test-sa@abcdef-test.iam.gserviceaccount.com
Upvotes: 1