Oussama Fathallah
Oussama Fathallah

Reputation: 117

Apply partitioning over a list in bigquery

I applied partitioning in the fact table on the country column using the FARM_FINGERPRINT function (ABS(MOD(FARM_FINGERPRINT((COUNTRY)),4000)). Now based on a different table (table1) that contains the list of countries for each zone (for example in 'Europe' we have 'France', 'Germany', 'Espagne') I want to run a query that detect the list of countries inside a given zone and run a where clause based on that list applying partitioning (to avoid full scan). But when i run this query partitioning is not applied :

WITH
  step1 AS (
  SELECT
    ARRAY_AGG(ABS(MOD(FARM_FINGERPRINT((MULTIDIVISION_CLUSTER_CODE)),4000))) AS list
  FROM (
    SELECT
      DISTINCT(MULTIDIVISION_CLUSTER_CODE) AS MULTIDIVISION_CLUSTER_CODE
    FROM
      `project.dataset.table1` table1
    WHERE
       table1.MULTIDIVISION_ZONE = "Europe" )),
  step2 AS(
  SELECT
    *
  FROM
    `project.dataset.table2`
  WHERE
    _hash_partition IN UNNEST((select list from step1))
  )
SELECT
  *
FROM
  step2

For information if i replace "_hash_partition IN UNNEST((select list from step1)" with "_hash_partition IN (2591,287,3623,1537)" or "_hash_partition IN UNNEST(([2591,287,3623,1537]))" it works (query do not do a full scan)

Table1: (zone , country)

Table2: (date, zone, country, _hash_partition, mesure)

Upvotes: 0

Views: 130

Answers (1)

Jaytiger
Jaytiger

Reputation: 12234

You may try a dynamic sql below. FORMAT function will generate the query same as what you said works.

-- simplified query for *step1* CTE.
CREATE TEMP TABLE step1 AS SELECT [2591,287,3623,1537] AS list;

EXECUTE IMMEDIATE FORMAT("""
WITH step2 AS (
  SELECT
    *
  FROM
    `project.dataset.table2`
  WHERE
    _hash_partition IN UNNEST(%s)
)
SELECT * FROM step2;
""", (SELECT FORMAT('%t', list) FROM step1));

Upvotes: 2

Related Questions