arun
arun

Reputation: 791

Query to find Distinct rows within the table values

I have a table structure like this example and I wanted to identify the records which have different values in Col1 for the same value in Col2. Example b2, c3, d4 below.

Table1

Col1            Col2
a1              x
a1              y
a1              z
b2              x
c3              x
d4              z
d4              z

Upvotes: 0

Views: 52

Answers (3)

Ted
Ted

Reputation: 77

How about something more simple:

SELECT COL2, COUNT(*) FROM table1 GROUP BY COL2 HAVING COUNT(*)>1

This will show you that "x" and "z" each have 3 records in the table.

Or, a little extension:

SELECT * FROM table1 
  WHERE COL2 IN (SELECT COL2 FROM table1 GROUP BY COL2 HAVING COUNT(*)>1)

which shows all rows with non-distinct COL2.

After re-reading the question a few times maybe this is what you have in mind:

SELECT * FROM tabele1 AS T1
  WHERE EXISTS(SELECT 1 FROM table1 AS T2 WHERE T1.COL2=T2.COL2 AND T1.COL1<>T2.COL2)

This finds all rows where another row with the same COL2 but different COL1 exists.

So, if you compare it to your example, it also finds the "a1" rows. But if you don't want to have the "first" occurrence some order criterion is needed. If you define "first" as "first in string ordering" the "T1.COL1<>T2.COL2" clause could be modified to "T1.COL1>T2.COL2".

Note that this statement won't find duplicate lines. So, if you remove the a1/z row from your example, the duplicate d4/z lines would not be returned. If those are also needed a UNION may do the trick:

SELECT COL1, COL2 FROM tabele1 AS T1
  WHERE EXISTS(SELECT 1 FROM table1 AS T2 WHERE T1.COL2=T2.COL2 AND T1.COL1<>T2.COL2)
UNION SELECT COL1, COL2 FROM table1 GROUP BY COL1, COL2 HAVING COUNT(*)>1

This reports duplicate lines in the result once, unlike the previous statement. If you also need an indication which of the rows occur more then once I'd do it as:

SELECT COL1, COL2, NUM FROM (
  SELECT COL1, COL2, 1 AS NUM FROM tabele1 AS T1
    WHERE EXISTS(SELECT 1 FROM table1 AS T2 WHERE T1.COL2=T2.COL2 AND T1.COL1<>T2.COL2)
  UNION SELECT COL1, COL2, COUNT(*) AS NUM FROM table1 GROUP BY COL1, COL2 HAVING COUNT(*)>1
) GROUP BY COL1, COL2

Does one of those match your requirements?

FYI: I verified the syntax of these statements using Oracle 12, but IMHO this should be standard SQL...

Upvotes: 1

EJ Egyed
EJ Egyed

Reputation: 6094

You can use a query like this one to find all the combinations you are looking for

Query 1

WITH
    table1 (col1, col2)
    AS
        (SELECT '111', 'x' FROM DUAL
         UNION ALL
         SELECT '111', 'y' FROM DUAL
         UNION ALL
         SELECT '111', 'z' FROM DUAL
         UNION ALL
         SELECT '222', 'x' FROM DUAL
         UNION ALL
         SELECT '333', 'x' FROM DUAL
         UNION ALL
         SELECT '444', 'z' FROM DUAL
         UNION ALL
         SELECT '444', 'z' FROM DUAL)
  SELECT DISTINCT col1, col2
    FROM table1
   WHERE col2 IN (  SELECT col2
                      FROM table1
                  GROUP BY col2
                    HAVING COUNT (DISTINCT col1) > 1)
ORDER BY col2, col1;

Result 1

   COL1    COL2
_______ _______
111     x
222     x
333     x
111     z
444     z

If you are on Oracle 19, you can use LISTAGG with DISTINCT to get all of the values in a single column

Query 2

WITH
    table1 (col1, col2)
    AS
        (SELECT 111, 'x' FROM DUAL
         UNION ALL
         SELECT 111, 'y' FROM DUAL
         UNION ALL
         SELECT 111, 'z' FROM DUAL
         UNION ALL
         SELECT 222, 'x' FROM DUAL
         UNION ALL
         SELECT 333, 'x' FROM DUAL
         UNION ALL
         SELECT 444, 'z' FROM DUAL
         UNION ALL
         SELECT 444, 'z' FROM DUAL)
  SELECT col2, LISTAGG (DISTINCT col1, ',') WITHIN GROUP (ORDER BY col1) AS distinct_vals
    FROM table1
GROUP BY col2
  HAVING COUNT (DISTINCT col1) > 1;

Result 2

   COL2    DISTINCT_VALS
_______ ________________
x       111,222,333
z       111,444

Upvotes: 1

GMB
GMB

Reputation: 222652

If I follow you correctly, you can do this with window functions:

select *
from (
    select t.*, 
        min(col1) over(partition by col2) min_col1, 
        max(col1) over(partition by col2) max_col1
    from mytable t
) t
where min_col1 <> max_col1

Upvotes: 1

Related Questions