Timo Rietveld
Timo Rietveld

Reputation: 541

BigQuery: How to flatten table in materialized view?

Via Google BigQuery, I'd like to visualize some data in Google Data Studio. Because the dataset is quite large in volume and I'd like to maximize the efficiency in data processing, I nested the data first (on both hit- and productlevel) with the following query (which is strongly simplified for illustration purposes) with as input a Google Analytics table (as by default imported from Google Analytics into BigQuery):

#standardSQL
SELECT
  visitorid, visitNumber, visitId, visitStartTime, date,
  ARRAY(
  SELECT
    AS STRUCT hits.hitNumber, hits.time, hits.hour,
    ARRAY(
    SELECT
      AS STRUCT product.productSKU, product.v2ProductName, product.productVariant
    FROM
      hits.product) AS productInfo
  FROM
    t.hits
  ORDER BY
    hits.hitNumber) AS hitInfo
FROM
  `[projectID].[DatasetID].ga_sessions_*` AS t
WHERE
  _TABLE_SUFFIX BETWEEN FORMAT_DATE("%Y%m%d",DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY))
  AND FORMAT_DATE("%Y%m%d",DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY))

Because I sensed Google Data Studio has issues dealing with nested data (incorrect aggregations), as a proposed solution I read elsewhere to flatten (i.e. unnest) the data in a materialized view first and connect the flattened data from this materialized view to Google Data Studio.

(Note: I could also have chosen to directly unnest the data in the above query and connect that to Google Data Studio, but I'd like to go for the 'materialized view'-solution because of data efficiency gain.)

Now, my question is: Does anyone know how to convert to an unnested format in the materialized view-query in this specific case? Reading the documentation, UNNEST() is not supported in materialized view, so I'm unsure how to do this.

Thanks in advance!

Upvotes: 0

Views: 1836

Answers (1)

Nick_Kh
Nick_Kh

Reputation: 5253

According to the documentation discovering BigQuery materialized views Limitations chapter:

A materialized view is limited to referencing only a single table, and is not able to use joins or UNNEST functionality

Having said this and following the discussion comments being posted by @Timo Rietveld and @Martin Weitzmann I assume that planning to visualize data in Google Data Studio and performing some aggregation functions it would probably required to flatten the data into the simple format using UNNEST operator achieving the best result.

For instance, event tables being imported from Google Analytics with multiple key-value parameters represented as an array in event column. You can flatten then each array element as a single row, simplifying data structure as well.

Any further comments or remarks will be highly appreciated, make wiki answer helping other contributors in their research.

Upvotes: 0

Related Questions