Kron
Kron

Reputation: 483

SQL Server Count instances of most frequent data

I do not believe what I need to do is too complicated, but hey it's Monday morning and I could use a bit of advice. Basically I am doing an aggregate query in SQL Server in the following format:

SELECT [Data1], COUNT(*), MAX([Data2])
FROM [Source]
GROUP BY [Data1]

However I need a 4th field as well. A field that counts the number of times the value MAX([Data2]) occurs.

I will make a quick example here:

|Data 1|Data 2|
|1     |x     |
|3     |p     |
|1     |z     |
|3     |f     |
|1     |x     |
|1     |b     |
|2     |h     |
|1     |o     |
|2     |h     |
|1     |x     |
|3     |f     |
|2     |h     |
|1     |z     |

Needs to produce the output:

|Data1|Count|Max|Occurances|
|1    |7    |x  |3         |
|2    |3    |h  |3         |
|3    |3    |f  |2         |

Any help would be appreciated.

Upvotes: 1

Views: 71

Answers (4)

Serkan Arslan
Serkan Arslan

Reputation: 13393

you can try this.

DECLARE @Source AS TABLE([Data1] INT,[Data2] VARCHAR(5))
INSERT INTO @Source VALUES
(1, 'x'), 
(3, 'p'),
(1, 'z'),
(3, 'f'),
(1, 'x'),
(1, 'b'),
(2, 'h'),
(1, 'o'),
(2, 'h'),
(1, 'x'),
(3, 'f'),
(2, 'h'),
(1, 'z')

;WITH T AS (
SELECT [Data1] , 
    COUNT(*) OVER(PARTITION BY [Data1]) [Count], 
    [Data2],    
    COUNT(*) OVER(PARTITION BY [Data2]) Occurances
FROM @Source
)
SELECT TOP 1 WITH TIES [Data1], [Count], [Data2] [Max], Occurances
FROM T 
ORDER BY ROW_NUMBER() OVER (PARTITION BY [Data1] ORDER BY Occurances DESC) 

Result:

Data1       Count       Max   Occurances
----------- ----------- ----- -----------
1           7           x     3
2           3           h     3
3           3           f     2

Upvotes: 0

Gordon Linoff
Gordon Linoff

Reputation: 1269693

What you are looking for has a name in statistics. You want the mode and the frequency of the mode.

I would approach this using two level of aggregation:

select data1, sum(cnt) as cnt,
       max(case when seqnum = 1 then data2 end) as mode,
       max(cnt) as mode_cnt
from (select data1, data2, count(*) as cnt,
             row_number() over (partition by data1 order by count(*) desc) as seqnum
      from t
      group by data1, data2
     ) t
group by data1;

Upvotes: 2

D-Shih
D-Shih

Reputation: 46219

You can try to use window function with cteto make it.

use CTE get COUNT Occurances by [Data 1] and [Data 2] column. then use another CTE2 make row number order by Occurances desc, which mean the max amount.

;with cte as (
SELECT [Data 1],
       [Data 2],
       COUNT(*) OVER (PARTITION BY [Data 1] ORDER BY [Data 1]) [count],
       COUNT(*) OVER (PARTITION BY [Data 2],[Data 1] ORDER BY [Data 2]) Occurances
FROM T
),cte2 as (
    SELECT  
        [Data 1], 
        [Data 2],
        [count],
        Occurances,row_number() over(partition by [Data 1] order by Occurances desc) rn
    FROM cte
)

select  [Data 1], 
        [Data 2],
        [count], 
        Occurances
from cte2
where rn = 1

sqlfiddle

Result

Data 1  Data 2  count   Occurances
1       x         7     3
2       h         3     3
3       f         3     2

Upvotes: 0

Abhishek
Abhishek

Reputation: 2490

Something like this should help -

;WITH CTE AS
(SELECT [Data1], COUNT(*) AS Data_Count, MAX([Data2]) AS Data_Max
FROM [Source]
GROUP BY [Data1])
SELECT [Data1],Data_Count,Data_Max,COUNT(Data_Max) AS Occurances
FROM CTE 
GROUP BY [Data1],Data_Count,Data_Max

Note: I would avoid using SQL Keywords as column names/aliases.

Upvotes: 0

Related Questions