Francesco Dassisis
Francesco Dassisis

Reputation: 142

filter out rows if one column value is associated with only one single value in another column

A   B   C   D   E
1981    a   b   CY3 2
1981    c   l   CY3 1
1981    f   r   CY3 5
1255    ee  ee  CY3 1
1255    ff  ff  CY3 1
1387    g   g   CY5 2
1387    h   h   CY5 10
1387    P   h   CY5 C7

I am getting this table with the following query (ORACLE pl/sql):

SELECT A,B,C,D,COUNT(*) AS E
FROM TAB1  t1 INNER JOIN TAB2 t2 ON t1.A = t2.B
             INNER JOIN TAB3 t3 ON t1.A = t3.C
GROUP BY A,B,C,D
ORDER BY D ASC, A DESC;

And I would like to get the following result by editing the above query:

A   B   C   D   E
1981    a   b   CY3 2
1981    c   l   CY3 1
1981    f   r   CY3 5
1255    ee  ee  CY3 1
1255    ff  ff  CY3 1

I would like to filter out the last three rows because the column D has one value (i.e. CY5) which is associated with only one single value in column A (i.e. 1387) whereas CY3 is associated with two different values (i.e. 1981 & 12550 so I wan to keep this).

Can anyone help me or point me to a similar question please?

Upvotes: 0

Views: 126

Answers (4)

Amir Kadyrov
Amir Kadyrov

Reputation: 1288

with s (a, b, c, d, e) as (
select 1981, 'a'  , 'b' , 'CY3', 2  from dual union all
select 1981, 'c'  , 'l' , 'CY3', 1  from dual union all
select 1981, 'f'  , 'r' , 'CY3', 5  from dual union all
select 1255, 'ee' , 'ee', 'CY3', 1  from dual union all
select 1255, 'ff' , 'ff', 'CY3', 1  from dual union all
select 1387, 'g'  , 'g' , 'CY5', 2  from dual union all
select 1387, 'h'  , 'h' , 'CY5', 10 from dual union all
select 1387, 'P'  , 'h' , 'CY5', 17 from dual)
select a, b, c, d, e
from 
    (select s.*, count(distinct a) over (partition by d) cnt_dict
     from s
    )
where cnt_dict > 1;

         A B  C  D            E
---------- -- -- --- ----------
      1255 ee ee CY3          1
      1255 ff ff CY3          1
      1981 f  r  CY3          5
      1981 c  l  CY3          1
      1981 a  b  CY3          2

Upvotes: 0

dnoeth
dnoeth

Reputation: 60462

Most DBMSes support Windowed Aggregates:

select *
from
 (
    SELECT A,B,C,D,COUNT(*) AS E,
       MIN(A) OVER (PARTITION BY D) AS minA,-- minimum A for all rows with the same D
       MAX(A) OVER (PARTITION BY D) AS maxA -- maximum A for all rows with the same D
    FROM TAB1  t1 INNER JOIN TAB2 t2 ON t1.A = t2.B
                 INNER JOIN TAB3 t3 ON t1.A = t3.C
    GROUP BY A,B,C,D
 ) as dt
where minA <> maxA -- there must be at least 2 different values (usually cheaper than COUNT(DISTINCT)
ORDER BY D ASC, A DESC;

Edit:

For Oracle and as you want to return the number of distinct values:

select *
from
 (
    SELECT A,B,C,D,COUNT(*) AS E,
       COUNT(DISTINCT A) OVER (PARTITION BY D) AS countA
    FROM TAB1  t1 INNER JOIN TAB2 t2 ON t1.A = t2.B
                 INNER JOIN TAB3 t3 ON t1.A = t3.C
    GROUP BY A,B,C,D
 ) dt -- you don't need the alias in Oracle, but STandard SQL requires it
where countA > 1
ORDER BY D ASC, A DESC;

Upvotes: 1

MD AZAD HUSSAIN
MD AZAD HUSSAIN

Reputation: 212

Have a look!

    DECLARE @TEST AS TABLE
(A VARCHAR(100),B VARCHAR(100),C VARCHAR(100),D VARCHAR(100))

INSERT INTO @TEST VALUES
('1981','A','B','CY3'),
('1981','A','B','CY3'),
('1981','C','L','CY3'),
('1981','F','R','CY3'),
('1981','F','R','CY3'),
('1981','F','R','CY3'),
('1981','F','R','CY3'),
('1981','F','R','CY3'),
('1255','EE','EE','CY3'),
('1255','FF','FF','CY3'),
('1387','G','G','CY5'),
('1387','G','G','CY5'),
('1387','H','H','CY5'),
('1387','H','H','CY5'),
('1387','H','H','CY5'),
('1387','H','H','CY5'),
('1387','H','H','CY5'),
('1387','H','H','CY5'),
('1387','H','H','CY5'),
('1387','H','H','CY5'),
('1387','H','H','CY5'),
('1387','H','H','CY5'),
('1387','P','H','CY5'),
('1387','P','H','CY5'),
('1387','P','H','CY5'),
('1387','P','H','CY5'),
('1387','P','H','CY5'),
('1387','P','H','CY5'),
('1387','P','H','CY5')

SELECT DATA.* FROM 
(
SELECT T.D,COUNT(T.A) AS DISTINCT_RECORD FROM (
SELECT DISTINCT D,A FROM @TEST) T
GROUP BY T.D
HAVING COUNT(T.A)>1
) 

CRITERIA LEFT JOIN 

(
SELECT A,B,C,D,COUNT(*) AS E
FROM @TEST
GROUP BY A,B,C,D
) 

DATA ON CRITERIA.D=DATA.D

Upvotes: 0

Fahmi
Fahmi

Reputation: 37473

use correlated subquery using exists

SELECT A,B,C,D,COUNT(*) AS E
FROM TABLESS t1
where exists (select 1 from TABLESS t2 where t1.D=t2.D having count(A)>1)
GROUP BY A,B,C,D
ORDER BY D ASC, A DESC

Upvotes: 0

Related Questions