Reputation: 78
Because it is client data I've replace in this post the project name and the dataset name by ******)
I'm trying to create a new schedule query in BigQuery on Google cloud platform The problem is I've got this error in the web Query editor
Cannot query over table '******.raw_bounce_rate' without a filter over column(s) 'dt' that can be used for partition elimination
The thing is I do filter on the column dt.
Here is the scheme of my external partitioned table
Tracking_Code STRING
Pages STRING NULLABLE
Clicks_to_Page INTEGER
Path_Lengths INTEGER
Visit_Number INTEGER
Visitor_ID STRING
Mobile_Device_Type STRING
All_Visits INTEGER
dt DATE
dt is the field of the partition and I selected the option "Require partition filter"
Here is the simplify sql of my query
WITH yesterday_raw_bounce_rate AS (
SELECT *
FROM `******.raw_bounce_rate`
WHERE dt = DATE_SUB(@run_date, INTERVAL 1 DAY)
),
entries_table as (
SELECT dt,
ifnull(Tracking_Code, "sans campagne") as tracking_code,
ifnull(Pages, "page non trackée") as pages,
Visitor_ID,
Path_Lengths,
Clicks_to_Page,
SUM(all_visits) AS somme_visites
FROM
yesterday_raw_bounce_rate
GROUP BY
dt,
Tracking_Code,
Pages,
Visitor_ID,
Path_Lengths,
Clicks_to_Page
HAVING
somme_visites = 1 and Clicks_to_Page = 1
)
select * from entries_table
if I remove the statement
Clicks_to_Page = 1
or if I replace the
DATE_SUB(@run_date, INTERVAL 1 DAY)
by a hard coded date
the query is accepted by Big Query, it does not make sense to me
Upvotes: 1
Views: 573
Reputation: 4042
Currently, there is an open issue, here. It addresses the error regarding using @run_date filter in the filter of scheduled queries to partitioned tables with required filter. The engineering team is currently working on it, although there is no ETA.
In your scheduled query, you can use one of the two workarounds using @run_date.As follows:
First option,
DECLARE runDateVariable DATE DEFAULT @run_date;
#your code...
WHERE date = DATE_SUB(runDateVariable, INTERVAL 1 DAY)
Second option,
DECLARE runDateVariable DATE DEFAULT CAST(@run_date AS DATE);
#your code...
WHERE date = DATE_SUB(runDateVariable, INTERVAL 1 DAY)
In addition, you can also use CURRENT_DATE() instead of @run_date, as shwon below:
DECLARE runDateVariable DATE DEFAULT CURRENT_DATE();
#your code...
WHERE date = DATE_SUB(runDateVariable, INTERVAL 1 DAY)
UPDATE
I have set up another scheduled query to run daily with a table partitioned by DATE from a field called date_formatted
and the partition filter is required. Then I have set up a backfill, here, so I could see the result of the scheduled query for previous days. Below is the code I used:
DECLARE runDateVariable DATE DEFAULT @run_date;
SELECT @run_date as run_date, date_formatted, fullvisitorId FROM `project_id.dataset.table_name` WHERE date_formatted > DATE_SUB(runDateVariable, INTERVAL 1 DAY)
Upvotes: 1