Reputation: 547
Long story short, I am sending multiple batch queries after each other.
query_1
creates a table. Let's call it table_1
.
query_2
creates a table, let's call it table_2
, using table_1
.
When I send the second batch query, because the first one has not been run, BigQuery instantly returns:
Error: Not found: Table table_1 was not found in location US
.
However, as batch queries are queued, query_1
will inevitably run before query_2
, so when query_2
is actually run, table_1
will exist.
I have two ideas:
SELECT * FROM (table_1 if it exists else table_2)
. This would always evaluate to table_1
, as, as mentioned above, query_1
would complete prior to query_2
being launched.I have tried both, and failed, because:
Please note that it is not possible to send queries 1 and 2 at once. If it is relevant, the queries are sent via a cloud function on GCP. The function is written in node. The only options I use are location
, query
, and priority
("BATCH").
Thank you for any help you can provide!
Upvotes: 1
Views: 60
Reputation: 699
You can try following code to check for availability of table_1
, if its not exist then it should query table_2
.
IF (
SELECT COUNT(1) AS cnt
FROM `proj.dataset`.__TABLES_SUMMARY__
WHERE table_id = 'table_1'
) = 1
THEN
SELECT * from `proj.dataset.table_1`;
ELSE
SELECT * FROM `proj.dataset.table_2`;
END IF;
Upvotes: 0
Reputation: 969
Your second cloud function query has a dependancy to first one.
So it must be guaranteed that the first query is excuted when second one is processed.
I think there are 2 options in this case.
First is, Insert interval time to your cloud function code between first and second query.
I'm not sure of this scenario as you noted that it is not possible to send both queries at once.
Second is, Use Cloud Dataflow.
It is batch data processing service that can query bigQuery table.
This seems more suitable option than using cloud function jobs.
Dataflow also can process multiple query asynchronously, So your both queries are guaranteed to be in order.
If your cloud function is triggered by Cloud Pub/sub messages, Using Dataflow SQL is best practice for this scenario.
Upvotes: 1