creamcheese
creamcheese

Reputation: 2544

Can I use @run_time in EXTERNAL_QUERY when using Scheduled Query in BigQuery?

Trying to use BigQuery which uses a Cloud SQL Postgres External Data Source. I'd like to keep it up to date every 15 minutes using a Scheduled Query so my dashboards can be up-to-date.

BigQuery appears to support parameters like @run_time which allow you to use BackFilling (it took me longer than 15 minutes to write this from the initial query that I want to append to, so I want to backfill).

I have a query like so in which I'd like to use that parameter to backfill minus 15 minutes:

MERGE INTO hd.users tgt
USING (
  SELECT * 
  FROM EXTERNAL_QUERY("projects/my-project/locations/europe-west5/connections/hd",
  '''
    SELECT * FROM "User" WHERE "createdAt" >= @run_time - INTERVAL '15m'
  ''')
) src
ON src.id = tgt.id
WHEN MATCHED THEN
    UPDATE SET id = src.id, createdAt = src.createdAt
WHEN NOT MATCHED THEN
    INSERT (id, createdAt) VALUES (id, createdAt);

Of course it works if I replace @run_time with NOW() but then I lose the ability to back fill. I don't see any reference in the documentation but then also nowhere does it state that it's not supported.

Upvotes: 0

Views: 459

Answers (1)

Ricco D
Ricco D

Reputation: 7277

It is not possible to use @run_time in your external source as this parameter only exist in BigQuery. Also it is not the best practice to frequently query data from an external source since it is relatively slow compared to querying data from BigQuery.

What I suggest is to stream data (using Dataflow or create a Datafusion pipeline, etc) if your update frequency is around 15 minutes.

Upvotes: 1

Related Questions