Reputation: 1

SQL SELECT Only 1 column when there is a duplicate in another column

First let me start off by saying I only have read privileges for our database.

I currently have multiple rows that have the columns with the same value:

Example

Row | col1 | col2 | col3 | col4 | col5 |

1   | AA   | BB   | CC   | AA   | CC   |
2   | DD   | CC   | DD   | BB   | BB   |
3   | ZZ   | ZZ   | CC   | CC   | BB   |
4   | AA   | AA   | DD   | AA   | AA   |

I am trying to only select values that are unique across the rows

1| AA | BB | CC 
2| DD | CC | BB
3| ZZ | CC | BB 
4| AA | DD |

I am really not sure where to begin. Kinda stumped on this one.

Upvotes: 0

Answers (4)

John Cappelletti

Reputation: 81930

If 2016+ and you have numerous or variable columns

Example

Declare @YourTable Table ([Row] varchar(50),[col1] varchar(50),[col2] varchar(50),[col3] varchar(50),[col4] varchar(50),[col5] varchar(50))  Insert Into @YourTable Values 
 (1,'AA','BB','CC','AA','CC')
,(2,'DD','CC','DD','BB','BB')
,(3,'ZZ','ZZ','CC','CC','BB')
,(4,'AA','AA','DD','AA','AA')

Select *
 From  (
        Select A.[Row]
              ,B.* 
         From  @YourTable A
         Cross Apply (
                        Select [Value] 
                              ,ColNr = concat('Col',dense_rank() over (order by value))
                         From  OpenJson( (Select A.* For JSON Path,Without_Array_Wrapper ) ) 
                         Where [Key] not in ('Row')
                     ) B
      ) src
 Pivot (max(Value) for ColNr in ([Col1],[Col2],[Col3],[Col4],[Col5]) ) pvt

Returns

Row Col1    Col2    Col3    Col4    Col5
1   AA      BB      CC      NULL    NULL
2   BB      CC      DD      NULL    NULL
3   BB      CC      ZZ      NULL    NULL
4   AA      DD      NULL    NULL    NULL

Upvotes: 0

Damien_The_Unbeliever

Reputation: 239636

This is very unpretty because someone's treated an SQL table as a spreadsheet, acting as if rows and columns are pretty well the same thing. Here's one approach that builds up a solution:

declare @t table (Row int, col1 char(2), col2 char(2), col3 char(2), col4 char(2), col5 char(2))
insert into @t(Row,col1,col2,col3,col4,col5) values
(1,'AA','BB','CC','AA','CC'),
(2,'DD','CC','DD','BB','BB'),
(3,'ZZ','ZZ','CC','CC','BB'),
(4,'AA','AA','DD','AA','AA')

select * from (
select Row,Foo,'col' + CONVERT(varchar(10),ROW_NUMBER() OVER (PARTITION BY Row ORDER BY Bar)) as Bar2  from (
select Row,Foo,Bar from (
select
    *,ROW_NUMBER() OVER (PARTITION BY Row,Foo ORDER BY Bar) rn
from @t
unpivot (Foo for Bar in (col1,col2,col3,col4,col5)) u
) v
where rn = 1
) w
) x
pivot (MAX(Foo) for Bar2 in (col1,col2,col3,col4,col5)) y

Upvotes: 0

Thom A

Reputation: 95554

Seems like you're going to need to unpivot the data, get the DENSE_RANK and then repivot:

WITH YourTable AS(
    SELECT *
    FROM (VALUES(1,'AA','BB','CC','AA','CC'),
                (2,'DD','CC','DD','BB','BB'),
                (3,'ZZ','ZZ','CC','CC','BB'),
                (4,'AA','AA','DD','AA','AA'))V([Row],Col1,Col2,Col3,Col4,Col5)),
Unpvt AS(
    SELECT YT.[Row],
           V.Col,
           DENSE_RANK() OVER (PARTITION BY YT.[Row] ORDER BY V.Col) AS DR
    FROM YourTable YT
         CROSS APPLY (VALUES(YT.Col1),
                            (YT.Col2),
                            (YT.Col3),
                            (YT.Col4),
                            (YT.Col5))V(Col))
SELECT U.[Row],
       MAX(CASE U.DR WHEN 1 THEN U.Col END) AS Col1,
       MAX(CASE U.DR WHEN 2 THEN U.Col END) AS Col2,
       MAX(CASE U.DR WHEN 3 THEN U.Col END) AS Col3,
       MAX(CASE U.DR WHEN 4 THEN U.Col END) AS Col4,
       MAX(CASE U.DR WHEN 5 THEN U.Col END) AS Col5
FROM Unpvt U
GROUP BY U.[Row];

Upvotes: 1

Gordon Linoff

Reputation: 1269473

This is a little tricky, but you can use apply and conditional aggregation:

select t.row, c.*
from t cross apply
     (select max(case when seqnum = 1 then col end) as col1,
             max(case when seqnum = 2 then col end) as col2,
             max(case when seqnum = 3 then col end) as col3,
             max(case when seqnum = 4 then col end) as col4,
             max(case when seqnum = 5 then col end) as col5
      from (select col,
                   row_number() over (order by min(pos)) as seqnum
            from (values (1, t.col1), (2, t.col2), (3, t.col3), (4, t.col4), (5, t.col5))
                 ) v(pos, col)
            group by col
      ) c;

Upvotes: 0

SQL SELECT Only 1 column when there is a duplicate in another column

Answers (4)

Related Questions