Bruno
Bruno

Reputation: 78

SQL Group By a Partition By

This must be accomplished in MS SQL Server. I believe OVER( PARTITION BY) must be used, but I've failed at all my tries and I end up counting the records to each ID or something else...

I have this table:

| ID   | COLOR  |
+------+--------+
| 1    | Red    |
| 1    | Green  |
| 1    | Blue   |
| 2    | Red    |
| 2    | Green  |
| 2    | Blue   |
| 3    | Red    |
| 3    | Brown  |
| 3    | Orange |

Notice that ID = 1 and ID = 2 have precisely the same values for COLOR, however ID = 3 only shares the value COLOR = Red.

I would like to group the table as follows:

| COLOR  | COUNT | GROUPING |
+--------+-------+----------+
| Red    | 2     | Type 1   |
| Green  | 2     | Type 1   |
| Blue   | 2     | Type 1   |
| Red    | 1     | Type 2   |
| Brown  | 1     | Type 2   |
| Orange | 1     | Type 2   |

This would mean that ID = 1 and ID = 2 share the same 3 values for color and they are aggregated together as Type 1. Although ID = 3 shares one value for color to ID = 1 and ID = 2 (which is 'Red') the rest of the values are not shared, as such it is considered of Type 2 (different grouping).

The tables used are simple examples and are enough to replicate to the entire dateset, however each ID can have in theory hundreds of records with different values for colors in each row. However they are unique, one ID can't have the the same color in different rows.

My best attempt:

SELECT
    ID, 
    COLOR,
    CONCAT ('TYPE ', COUNT(8) OVER( PARTITION by ID)) AS COLOR_GROUP
FROM 
    {TABLE};

Result:

| ID   | COLOR  | GROUPING |
+------+--------+----------+
| 1    | Green  | Type 3   |
| 1    | Blue   | Type 3   |
| 1    | Red    | Type 3   |
| 2    | Green  | Type 3   |
| 2    | Blue   | Type 3   |
| 2    | Red    | Type 3   |
| 3    | Red    | Type 3   |
| 3    | Brown  | Type 3   |
| 3    | Orange | Type 3   |

Although the results are terrible I've tried different methods, none of them is better.

Hope I was clear enough.

Thank you for the help!

Upvotes: 0

Views: 437

Answers (1)

sacse
sacse

Reputation: 3744

try the following:

declare @t table ( ID  int,COLOR varchar(100))
insert into @t select 1   ,'Red'    
insert into @t select 1   ,'Green'  
insert into @t select 1   ,'Blue'   
insert into @t select 2   ,'Red'    
insert into @t select 2   ,'Green'  
insert into @t select 2   ,'Blue'   
insert into @t select 3   ,'Red'    
insert into @t select 3   ,'Brown'  
insert into @t select 3   ,'Orange'


select *, STUFF((SELECT CHAR(10) + ' '+COLOR
                  FROM @t t_in where t_in.ID=t.ID
                  order by COLOR
                   FOR XML PATH ('')) , 1, 1, '') COLOR_Combined 
into #temp
from @t t

select COLOR, count(color) [COUNT], 'TYPE ' + convert(varchar(10), dense_rank() OVER (order by [grouping])) [GROUPING]
from
(   
    select id, COLOR, COLOR_Combined,  (row_number() over (order by id) - row_number() over (partition by Color_Combined order by id)) [grouping]
    from #temp
)t
group by COLOR, [grouping]
drop table if exists #temp

Please find the db<>fiddle here.

Upvotes: 1

Related Questions