Reputation: 51200
I have a data set that looks like this:
User | Task | Time
--------|--------|--------
User A | Task X | 100
User A | Task Y | 200
User A | Task Z | 300
User B | Task X | 400
User B | Task Y | 500
User B | Task Z | 600
User C | Task X | 700
User C | Task Y | 800
User C | Task Z | 900
User D | Task X | 1000
User D | Task Y | 1100
user D | Task Z | 1200
When I do my initial grouping, the data looks like this:
| Avg User | Avg Task X | Avg Task Y | Avg Task Z
User | Time | Time | Time | Time
-------|----------|------------|------------|------------
User A | 200 | 100 | 200 | 300
User B | 500 | 400 | 500 | 600
User C | 800 | 700 | 800 | 900
User D | 1100 | 1000 | 1100 | 1200
I need it to look like this:
| Avg User | Avg Task X | Avg Task Y | Avg Task Z
User | Time | Time | Time | Time
------|----------|------------|------------|------------
All | 650 | 550 | 650 | 750
This is how I got those numbers:
650 = (200+500+800+1100) / 4
550 = (100+400+700+1000) / 4
650 = (200+500+800+1100) / 4
750 = (300+600+900+1200) / 4
In other words, I have a column group on Task and a row group on User. The problem is that I want the row group to get summarized an extra time.
At first glance I could just return the user's name back as 'All' and it would summarize but this doesn't actually give me the averages that I need. I need to first SUM the times by user, and then find the average per user. If I change the way the original data is shaped, my task groups will no longer work properly.
If I try to use a "Totals" row on my row group, it aggregates the ORIGINAL data and not the summarized/grouped data. That is rather disappointing because it is actually incorrect in my eyes.
Upvotes: 3
Views: 659
Reputation: 48169
Here's the query I would write that works... The "PreQuery" is done to group the counts and sum of each element for a given user... Then that is rolled-up to the top-most level of "All". Now, this is based on your data sample.
SELECT
AVG( TaskTime / TaskCount ) as TaskAvg,
SUM( XTime ) / SUM( XCount ) as XAvg,
SUM( YTime ) / SUM( YCount ) as YAvg,
SUM( ZTime ) / SUM( ZCount ) as ZAvg
from
( SELECT
user,
COUNT(*) as TaskCount,
SUM( Time ) as TaskTime,
CASE WHEN Task = "Task X" THEN 1 ELSE 0 END as XCount,
CASE WHEN Task = "Task X" THEN Time ELSE 0 END as XTime,
CASE WHEN Task = "Task Y" THEN 1 ELSE 0 END as YCount,
CASE WHEN Task = "Task Y" THEN Time ELSE 0 END as YTime,
CASE WHEN Task = "Task Z" THEN 1 ELSE 0 END as ZCount,
CASE WHEN Task = "Task Z" THEN Time ELSE 0 END as ZTime
FROM
AllUsersTasks
group by ;
user ) PreQuery
If your data could provide that a given user has multiple entries for a single Task, such as 3 entries for User A, Task X has Times of 95, 100 and 105, you have 3 entries for 300 which results in the 100. This could skew your OVERALL Average of this task and would have to modify the query. Let me know if a person will have multiple entries per a given task based on production data... If so, then THAT element would probably need to be put into its OWN pre-query where the "From AllUserTasks" table is.
Upvotes: 0
Reputation: 938
I would do this in a sql script, doing this in reporting would be overkill (although it probably would be possible).
I have and example script right here:
drop table #tmp, #tmp2, #tmp3
select 'User A' as [User],' Task X ' as [Task],100.00 as [Time]
into #tmp
union all
select 'User A ',' Task Y ',200
union all
select 'User A ',' Task Z ',300
union all
select 'User B ',' Task X ',400
union all
select 'User B ',' Task Y ',500
union all
select 'User B ',' Task Z ',600
union all
select 'User C ',' Task X ',700
union all
select 'User C ',' Task Y ',800
union all
select 'User C ',' Task Z ',900
union all
select 'User D ',' Task X ',1000
union all
select 'User D ',' Task Y ',1100
union all
select 'User D ',' Task Z ',1200
select [User],
Task,
Sum(time) as time
into #tmp2
from #tmp
group by [User],
[Task]
select [User],
avg(time) as time
into #tmp3
from #tmp2
group by [User];
declare @statement nvarchar(max);
select @statement =
'with cteTimes as (
select *
from #tmp2 t
pivot (sum (t.[time]) for Task in (' + stuff((select ', ' + quotename([Task]) from #tmp group by [Task] for xml path, type).value('.','varchar(max)'), 1, 2, '') + ')) as Task
)
select ''All'' as [User],
(select avg(usr.time) from #tmp3 usr),'
+ stuff((select ', avg(' + quotename([Task]) + ') as ' + quotename([Task]) from #tmp group by [Task] for xml path, type).value('.','varchar(max)'), 1, 2, '') +
+'from cteTimes x ';
exec sp_executesql @statement;
The script can probably be optimized by using a pivot instead of multiple joins while creating the #tmp4. My example is just explanatory.
Upvotes: 0
Reputation: 3690
Assuming your source is SQL Server 2008 you might be able to use a combination of grouping sets:
http://technet.microsoft.com/en-us/library/bb522495.aspx
And the SSRS Aggregate Function:
http://msdn.microsoft.com/en-us/library/ms155830(v=sql.90).aspx
This blog has an example that may also be helpful
Good Luck
Upvotes: 0
Reputation: 3054
The only way I was able to do this type of functionality is to was to use the Code section of the report. I would keep track of the group data I wanted to summarize in a global variable in that I would later output to the field that I wanted.
Here is a microsoft article to describe how to embed code into your report http://msdn.microsoft.com/en-us/library/ms159238.aspx
Here is a much more detailed way to solve your problem. Link
Upvotes: 1