noob'88
noob'88

Reputation: 117

BigQuery Over and Analytics Functions

I am trying to write a SQL leveraging BigQuery's "Over" functionality.

I'm essentially trying to do some compounding and smoother returns but I'm hitting a bit of a rut. My query looks like this:

WITH daily_extract AS (select date_adjusted,
                              sum(some_number) as 
                              daily_sum_of_some_number,
                              entity_id
                       from `some table`
                       where 1=1 -- no filter  
                       and   1=1 -- no filter  
                       group by date_adjusted, entity_id
                       order by date_adjusted
                       )
      select
      date_adjusted as date_adjusted,
      entity_id as entity_id,
      sum(log(1 + daily_sum_of_some_number)) 
          over (PARTITION by entity_id 
                ORDER     by date_adjusted 
                rows unbounded preceding) as smoothed_number
      from daily_extract
      group by 1,2

But what I get is an error such as this:

Failed to retrieve data - SELECT list expression references column daily_sum_return_selected_nav which is neither grouped nor aggregated at [16:19].

I tried to say group by 1,2,3 at the bottom but then I get the error:

Failed to retrieve data - Column 3 contains an analytic function, which is not allowed in GROUP BY at [18:20]

I'm reading the link below, but if anyone could help me out it would be greatly appreicated!

https://cloud.google.com/bigquery/docs/reference/standard-sql/functions-and-operators#analytic-functions

Upvotes: 1

Views: 3522

Answers (1)

Gordon Linoff
Gordon Linoff

Reputation: 1271241

I can readily see at least three problems. Your CTE is aggregating by the wrong columns, the window frame clause is unnecessary, and you are missing an aggregation in the outer query.

You might have other problems as well, but this might help:

with daily_extract as (
      select date_adjusted, sum(some_number) as daily_sum_of_some_number,
             entity_id
      from `some table`
      group by date_adjusted, entity_id
------------------------------^
     )
select date_adjusted, entity_id,
      sum(log(1 + sum(daily_sum_of_some_number))) over
          (partition by entity_id 
           order by date_adjusted 
          ) as smoothed_number
from daily_extract
group by 1, 2;

Upvotes: 3

Related Questions