Boy Wijnmaalen
Boy Wijnmaalen

Reputation: 249

big query materialized view; create a materialized view using an (inner) join

Hopefully someone can help me move forward cause it seems I'm stuck.

I've read the documentation carefully on how to create a materialized view, however I'm getting this unclear error;

Materialized view query contains unsupported feature.

I know the error is in the join, if I remove it, the error disappears as well, however inner joins should be supported when looking at the documentation, so I'm a little bit lost right now.

Background: we use firestore and its BigQuery Extension to get our collection into bigQuery. for performance reasons I'm looking into the materialized view option.

Question: Basically what I want is to get the latest document from the (Firestore Bigquery Extension) changelog table. I first gather the latest timestamp per document_id, I then want to retrieve the actual document (the data column) from the data table.

More specifically; can someone help to;

  1. A; figure out what I'm doing wrong
  2. B; or tell me how I create a materialized view where I;
    • always have access to the latest document version
    • exclude deleted documents (operation = 'DELETE`) from the result set

All help is very much appreciated!


CREATE OR REPLACE MATERIALIZED VIEW `XXX.XXX.view_name` AS
  WITH `ids` AS (
    SELECT
      `document_name`,
      MAX(`timestamp`) AS `timestamp`,
    FROM `XXX.XXX.id_table`
    GROUP BY `document_name`
  )

  SELECT
    `d`.`data`
  FROM `XXX.XXX.data_table` AS `d`
  INNER JOIN `ids` AS `i` ON (
    `i`.`timestamp` = `d`.`timestamp`
    `i`.`document_name` = `d`.`document_name`
  )
  WHERE `operation` != 'DELETE'
;

Upvotes: 0

Views: 1041

Answers (1)

Pratik Patil
Pratik Patil

Reputation: 832

This is my guess based the current documentation: Document says: "Aggregates in the materialized view query must be outputs. Computing or filtering based on an aggregated value is not supported."

In your "WITH" clause you have used MAX(timestamp) aggregate function and that column you are using it in join's ON computation. And probably thats why it not supported.

Upvotes: 2

Related Questions