Reputation: 43
The recently announced (and long-awaited) window analytical functions in Google Big Query seems very incomplete. Below is a query that was supposed to calculate the combined weighted score of several (3 in the example) financial indicators of a company (this works in Oracle and MS SQL!):
select
( percent_rank() over (order by cash_discount desc)*0.4
+percent_rank() over (order by trailing_pe desc)*0.5
+percent_rank() over (order by forward_pe desc)*0.5
) combined_weighted_score
,company_symbol, company_name
from workspace.company c
In GBQ we get: Error: Encountered " "OVER" "over "" at line 2.
I decided to research the issue further - here is an even simpler query:
select
percent_rank() over (order by cash_discount desc) r1
,percent_rank() over (order by trailing_pe desc) r2
,percent_rank() over (order by forward_pe desc) r3
,company_symbol, company_name
from workspace.company c
which gets me this: Error: Multiple distinct OVER clauses are not yet supported. If more than one OVER clauses appear in a query they must be identical. Found AnalyticSpecs that differ in sort keys 0
So, GBQ appears to have very incomplete implementation of analytical window functions! Next, I tried some ugly hacks like:
select c.company_name, (r1.r*0.4+r2.r*0.5) combined_weighted_score
from [workspace.company] c
left join (select company_symbol, percent_rank() over (order by cash_discount desc) r from workspace.company) r1
on c.company_symbol=r1.company_symbol
left join (select company_symbol, percent_rank() over (order by trailing_pe desc) r from workspace.company) r2
on c.company_symbol=r2.company_symbol
well, still does not work:
Error: 5.1 - 0.0: A query cannot have multiple JOIN clauses
Any solutions? I know this is very hard to implement in classic SQL, without the help of properly implemented window functions. But GBQ annoucement sounded very promissing ...
Upvotes: 0
Views: 299
Reputation: 59175
You almost got it!
Every OVER() clause needs the same predicate in the same SELECT statement, but you can JOIN 2 different tables - as you found out.
Only 2 tables can be joined, but you need 3. The easy workaround is to do a subquery and JOIN that one.
Instead of:
SELECT x.a, y.b, z.c
FROM x JOIN y ON x.j=y.j JOIN z ON y.j=z.j
you can do:
SELECT x.a, yz.b, yz.c
FROM x JOIN (
SELECT y.b b, z.c c, y.j j
FROM y JOIN z on y.j=z.j) yz
ON x.j=yz.j
(you might need to change the JOIN for a JOIN EACH)
Upvotes: 2