Shane
Shane

Reputation: 33

SQL Aggregates OVER and PARTITION

All,

This is my first post on Stackoverflow, so go easy...

I am using SQL Server 2008.

I am fairly new to writing SQL queries, and I have a problem that I thought was pretty simple, but I've been fighting for 2 days. I have a set of data that looks like this:

UserId          Duration(Seconds)        Month
1               45                       January
1               90                       January
1               50                       February
1               42                       February
2               80                       January
2               110                      February
3               45                       January
3               62                       January
3               56                       January
3               60                       February

Now, what I want is to write a single query that gives me the average for a particular user and compares it against all user's average for that month. So the resulting dataset after a query for user #1 would look like this:

UserId         Duration(seconds)        OrganizationDuration(Seconds)        Month
1              67.5                     63                                   January
1              46                       65.5                                 February

I've been batting around different subqueries and group by scenarios and nothing ever seems to work. Lately, I've been trying OVER and PARTITION BY, but with no success there either. My latest query looks like this:

select Userid, 
       AVG(duration) OVER () as OrgAverage,
       AVG(duration) as UserAverage,
       DATENAME(mm,MONTH(StartDate)) as Month
            from table.name 
            where YEAR(StartDate)=2014
            AND userid=119 
                  GROUP BY MONTH(StartDate), UserId     

This query bombs out with a "Duration' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause" error.

Please keep in mind I'm dealing with a very large amount of data. I think I can make it work with CASE statements, but I'm looking for a cleaner, more efficient way to write the query if possible.

Thank you!

Upvotes: 1

Views: 2152

Answers (5)

CSharper
CSharper

Reputation: 5590

I was able to get it done using a self join, There's probably a better way.

Select UserId, AVG(t1.Duration) as Duration, t2.duration as OrgDur, t1.Month 
from #temp t1
inner join (Select Distinct MONTH, AVG(Duration) over (partition by Month) as duration
from #temp) t2 on t2.Month = t1.Month
group by t1.Month, t1.UserId, t2.Duration 
order by t1.UserId, Month desc

Here's using a CTE which is probably a better solution and definitely easier to read

With MonthlyAverage
as 
(
Select MONTH, AVG(Duration) as OrgDur 
from #temp
group by Month
)

Select UserId, AVG(t1.Duration) as Duration, m.duration as OrgDur , t1.Month 
from #temp t1
inner join MonthlyAverage m on m.Month = t1.Month
group by UserId, t1.Month, m.duration

Upvotes: 1

Karthik Kola
Karthik Kola

Reputation: 93

You can try below with less code.

SELECT Distinct UserID,
AVG(Duration)  OVER(PARTITION BY [Month]) AS TotalAvg,
AVG(Duration) OVER(PARTITION BY UserID, [Month] ORDER BY UserID) AS DetailedAvg, 
[Month]
FROM [dbo].[Test]

Upvotes: 0

Karthik Kola
Karthik Kola

Reputation: 93

Please try this. It works fine to me.

WITH C1
AS
(
SELECT 
AVG(Duration) AS TotalAvg, 
[Month]
FROM [dbo].[Test]
GROUP BY [Month]
),
C2
AS
(
SELECT Distinct UserID,
AVG(Duration) OVER(PARTITION BY UserID, [Month] ORDER BY UserID) AS DetailedAvg, 
[Month]
FROM [dbo].[Test]
)
SELECT C2.*, C1.TotalAvg
FROM C2 c2 
INNER JOIN C1 c1 ON c1.[Month] = c2.[Month]
ORDER BY c2.UserID, c2.[Month] desc;

Upvotes: 1

Ruskin
Ruskin

Reputation: 6171

You are joining two queries together here:

  • Per-User average per month
  • All Organisation average per month

If you are only going to return data for one user at a time then an inline select may give you joy:

SELECT AVG(a.duration) AS UserAvergage,
   (SELECT AVG(b.Duration) FROM tbl b WHERE MONTH(b.StartDate) = MONTH(a.StartDate)) AS OrgAverage 
    ...
    FROM tbl a
    WHERE userid = 119 
    GROUP BY MONTH(StartDate), UserId

Note - using comparison on MONTH may be slow - you may be better off having a CTE (Common Table Expression)

Upvotes: 1

Adi
Adi

Reputation: 232

missing partition clause in Average function

OVER ( Partition by MONTH(StartDate)) 

Upvotes: 1

Related Questions