SQL Query to find duplicate data across columns

Question

I am trying to query a DB2 table to find duplicates across columns. To be more specific, I am looking for rows where there are duplicate columns but not necessarily across the same column name.

Example.

Graph_Table has four columns of coordinates

Row1

Column1 = -1
Column2 = -4
column3 = 6
column4 = 3

Row7

Column1 = 3
Column2 = -1
Column3 = 6
Column4 = -4

The query would count this as a duplicate or return the result set as duplicates because the coordinates are the same but not from the same column. I have researched several queries but they all operate using the ordinal rows/values for the comparison.

Gordon Linoff · Accepted Answer

DB2 supports lateral and listagg(). Hence, I think you can do:

select gt.*
from (select gt.*, count(*) over (partition by v.cols) as cnt
      from graph_table gt, lateral
           (select listagg(col, ',') within group (order by col) as cols
            from (values (gt.column1), (gt.column2), (gt.column3), (gt.column4)
                 ) v(col)
           ) v
     ) gt
where cnt >= 2;

SQL Query to find duplicate data across columns

Answers (1)

Related Questions