Reputation: 2544
Trying to use BigQuery which uses a Cloud SQL Postgres External Data Source. I'd like to keep it up to date every 15 minutes using a Scheduled Query so my dashboards can be up-to-date.
BigQuery appears to support parameters like @run_time which allow you to use BackFilling (it took me longer than 15 minutes to write this from the initial query that I want to append to, so I want to backfill).
I have a query like so in which I'd like to use that parameter to backfill minus 15 minutes:
MERGE INTO hd.users tgt
USING (
SELECT *
FROM EXTERNAL_QUERY("projects/my-project/locations/europe-west5/connections/hd",
'''
SELECT * FROM "User" WHERE "createdAt" >= @run_time - INTERVAL '15m'
''')
) src
ON src.id = tgt.id
WHEN MATCHED THEN
UPDATE SET id = src.id, createdAt = src.createdAt
WHEN NOT MATCHED THEN
INSERT (id, createdAt) VALUES (id, createdAt);
Of course it works if I replace @run_time with NOW() but then I lose the ability to back fill. I don't see any reference in the documentation but then also nowhere does it state that it's not supported.
Upvotes: 0
Views: 459
Reputation: 7277
It is not possible to use @run_time
in your external source as this parameter only exist in BigQuery. Also it is not the best practice to frequently query data from an external source since it is relatively slow compared to querying data from BigQuery.
What I suggest is to stream data (using Dataflow or create a Datafusion pipeline, etc) if your update frequency is around 15 minutes.
Upvotes: 1