Tunneller
Tunneller

Reputation: 599

How can I determine if BigQuery is doing an incremental update for Materialized View automatic refresh

This is a follow-on question to::

Understanding Google Bigquery materialized views pricing

I've made a materialized view that uses a reasonably complex set of queries against a 50GByte table. Data is getting updated on the table every 10-20 minutes and I've set the materialized view to update every 30 minutes. Perfect..... Except what I really want is for the materialized view to only run its query on the most recent data. The Google documentation says, in effect, that it will do exactly that as long as the query is "simple". My query has JSON extracts, some REGEXPS and a few CASE statements, but no SQL UDF's or Javascript.

I actually would like to have some complicated UDF's.... but not if that would tip a "simple" into something that caused a full 50GByte refresh each time.

How can I determine if my query has been deemed simple enough that it is not querying the full 50GByte every thirty minutes? Other than waiting to see my bill at the end of the month...

Upvotes: 1

Views: 889

Answers (1)

Tunneller
Tunneller

Reputation: 599

Some progress. First even though I am admin, I needed to add to IAM

• roles/bigquery.metadataViewer

Without that, nothing worked. Once I had done that, then the following query showed the calls to BQ.REFRESH_MATERIALIZED_VIEW

SELECT
  creation_time,job_id,query, total_slot_ms, total_bytes_processed,
  materialized_view_statistics.materialized_view[SAFE_OFFSET(0)].rejected_reason
  AS full_refresh_reason
FROM
  `region-us.INFORMATION_SCHEMA.JOBS_BY_PROJECT`
WHERE
  job_id LIKE '%materialized_view_refresh_%'
ORDER BY creation_time DESC LIMIT 20

Now I see that some of my Material Views are behaving correctly: the first time they are created the full-refresh-reason is flagged as NO DATA and the amount of data consumed is large, but all subsequent refreshes get flagged as reason "null" and the amount of data consumed is pleasantly small.

Unfortunately, my big beastie is always getting flagged as NO DATA so every time it refreshes then it chews up the entire 50 GByte, which kind of misses the point... but at least now I have something to get started.

Upvotes: 0

Related Questions