Reputation: 483
I do not believe what I need to do is too complicated, but hey it's Monday morning and I could use a bit of advice. Basically I am doing an aggregate query in SQL Server in the following format:
SELECT [Data1], COUNT(*), MAX([Data2])
FROM [Source]
GROUP BY [Data1]
However I need a 4th field as well.
A field that counts the number of times the value MAX([Data2])
occurs.
I will make a quick example here:
|Data 1|Data 2|
|1 |x |
|3 |p |
|1 |z |
|3 |f |
|1 |x |
|1 |b |
|2 |h |
|1 |o |
|2 |h |
|1 |x |
|3 |f |
|2 |h |
|1 |z |
Needs to produce the output:
|Data1|Count|Max|Occurances|
|1 |7 |x |3 |
|2 |3 |h |3 |
|3 |3 |f |2 |
Any help would be appreciated.
Upvotes: 1
Views: 71
Reputation: 13393
you can try this.
DECLARE @Source AS TABLE([Data1] INT,[Data2] VARCHAR(5))
INSERT INTO @Source VALUES
(1, 'x'),
(3, 'p'),
(1, 'z'),
(3, 'f'),
(1, 'x'),
(1, 'b'),
(2, 'h'),
(1, 'o'),
(2, 'h'),
(1, 'x'),
(3, 'f'),
(2, 'h'),
(1, 'z')
;WITH T AS (
SELECT [Data1] ,
COUNT(*) OVER(PARTITION BY [Data1]) [Count],
[Data2],
COUNT(*) OVER(PARTITION BY [Data2]) Occurances
FROM @Source
)
SELECT TOP 1 WITH TIES [Data1], [Count], [Data2] [Max], Occurances
FROM T
ORDER BY ROW_NUMBER() OVER (PARTITION BY [Data1] ORDER BY Occurances DESC)
Result:
Data1 Count Max Occurances
----------- ----------- ----- -----------
1 7 x 3
2 3 h 3
3 3 f 2
Upvotes: 0
Reputation: 1269693
What you are looking for has a name in statistics. You want the mode and the frequency of the mode.
I would approach this using two level of aggregation:
select data1, sum(cnt) as cnt,
max(case when seqnum = 1 then data2 end) as mode,
max(cnt) as mode_cnt
from (select data1, data2, count(*) as cnt,
row_number() over (partition by data1 order by count(*) desc) as seqnum
from t
group by data1, data2
) t
group by data1;
Upvotes: 2
Reputation: 46219
You can try to use window function with cte
to make it.
use CTE
get COUNT
Occurances
by [Data 1]
and [Data 2]
column. then use another CTE2
make row number order by Occurances
desc, which mean the max amount.
;with cte as (
SELECT [Data 1],
[Data 2],
COUNT(*) OVER (PARTITION BY [Data 1] ORDER BY [Data 1]) [count],
COUNT(*) OVER (PARTITION BY [Data 2],[Data 1] ORDER BY [Data 2]) Occurances
FROM T
),cte2 as (
SELECT
[Data 1],
[Data 2],
[count],
Occurances,row_number() over(partition by [Data 1] order by Occurances desc) rn
FROM cte
)
select [Data 1],
[Data 2],
[count],
Occurances
from cte2
where rn = 1
Result
Data 1 Data 2 count Occurances
1 x 7 3
2 h 3 3
3 f 3 2
Upvotes: 0
Reputation: 2490
Something like this should help -
;WITH CTE AS
(SELECT [Data1], COUNT(*) AS Data_Count, MAX([Data2]) AS Data_Max
FROM [Source]
GROUP BY [Data1])
SELECT [Data1],Data_Count,Data_Max,COUNT(Data_Max) AS Occurances
FROM CTE
GROUP BY [Data1],Data_Count,Data_Max
Note: I would avoid using SQL Keywords
as column names/aliases.
Upvotes: 0