aesthetics
aesthetics

Reputation: 81

why isn't this window function working? I can't use a group by

I am working in MySQL and I have this table:

url             x   y   count *
abc.com         1   1   2
abc.com         2   1   2
xyz.com         1   56  5
xyz.com         2   71  5
xyz.com         3   59  5
xyz.com         4   67  5
xyz.com         5   59  5

and I'm trying to run this code:

count(dh.x) over(partition by dh.url) 

in order to get a count of 2 for url of abc.com and a count of 5 for url of xyz.com to be the denominator I divide by in my query below...

that window function is nested in this larger CTE:

, stdev_estimates AS (   
    SELECT dh.url
    , CASE SUM(SQUARE(dh.x - xmean)) WHEN 0 THEN 1 ELSE SQRT(SUM(SQUARE(dh.x - xmean)) / (count(dh.x) over(partition by dh.url)) - 1) END AS xstdev
    , SQRT(SUM(SQUARE(dh.y - ymean)) / (count(dh.x) over(partition by dh.url)) - 1) AS ystdev
    , count(dh.x) over(partition by dh.url) 'denom'
    FROM date_hits dh
    INNER JOIN mean_estimates m ON m.url = dh.url
    GROUP BY dh.url, m.xmean, m.ymean, dh.x, dh.y
)
select *
from stdev_estimates

but when I try to run the inner subselect, I keep getting this error:

Msg 8120, Level 16, State 1, Line 49 Column 'date_hits.x' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.

But it doesn't make sense because I thought this was a workaround needing a group by for the dh.x? I thought you don't need group by's if you're using an aggregating function.

Bottom line is I need to not group by dh.x because it breaks out my table into 7 rows when I'm aggregating everything down to:

url            xstdev    ystdev
abc.com       .12345    .5679
xyz.com       .23455    .79023

Upvotes: 0

Views: 558

Answers (1)

GMB
GMB

Reputation: 222472

What's within the window function must be an expression that complies with the outer group by.

If you remove dh.x from the group by clause, you cannot use it alone, you need an aggregate function on it.

Presumably, you want to replace this:

count(dh.x) over(partition by dh.url)

With:

sum(count(dh.x)) over(partition by dh.url)

Upvotes: 1

Related Questions