busheriff
busheriff

Reputation: 165

scheduled query to partitioned and clustered table

I have created a BQ table with time partitioned (day/ timestamp) and also clustering data using region and location field.

I have created a daily scheduled query using the query that I ran to create the initial records. The scheduled query was set to append data to the previously created partitioned and clustered table. However, scheduled query failed to run with a message below:

Incompatible table partitioning specification. Destination table exists with partitioning specification interval(type:DAY,field:timestamp) clustering(location,region), but transfer target partitioning specification is interval(type:DAY,field:timestamp). Please retry after updating either the destination table or the transfer partitioning specification.

The simplified version of the query is below:

CREATE TEMP FUNCTION
  isYesterday(recordTimestamp TIMESTAMP) AS ( DATE(recordTimestamp) = DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY) );

WITH
  daily AS (
  SELECT
    TIMESTAMP_TRUNC(timestamp,day) day,
    location,
    direction,
    region,
    MAX(metric1) metric1,
    MAX(metric2) metric2
  FROM (
    SELECT
      location,
      timestamp,
        direction,
        CASE
        WHEN area>0 AND area<20 THEN "region1"
        WHEN area>10
      AND area<20 THEN "region2"
        WHEN area>30 AND area<40 THEN "Region3"
        WHEN area=61 THEN "Region4"
        WHEN area=65 THEN "Region5"
        WHEN area=76 THEN "Region6"
        WHEN area>89 AND area<100 THEN "Region7"
    END
      AS region,
      
      SUM(terminalcount0) metric1,
      SUM(terminalcount1) metric2
    FROM
      `statistic`
    WHERE
  isYesterday(timestamp)
--      DATE(timestamp) BETWEEN '2020-07-01'
--      AND DATE_SUB(CURRENT_DATE(),INTERVAL 1 day)
        GROUP BY
      location,
      timestamp,
      direction,
      region )
  GROUP BY
    day,
    location,
    direction,
    region)
SELECT
  day as timestamp,
  location,
  direction,
  SUM(metric1) metric1,
  SUM(metric2) metric2
FROM
  daily
GROUP BY
  day,
  location,
  direction,
  region
  
  

How do I fix this so that scheduled query can run and save the results to the destination table regularly? Or is it really the scheduled query to the existing clustered table not supported by GCP?

Upvotes: 2

Views: 1753

Answers (1)

ldavid
ldavid

Reputation: 2552

There's an answer for this problem here: 61896744

You can workaround this problem by adding a view with your select query:

SELECT field_a, field_b, datetime, date
FROM large_table
WHERE date >= DATE_SUB(CURRENT_DATE(), INTERVAL 1 WEEK)

And creating a scheduled query like this:

INSERT INTO `my_clustered_table` (field_a, field_b, ...)
SELECT *
FROM `my_view`

Upvotes: 3

Related Questions