Reputation: 466
The values of interest are an EIN
and a registration number REG
.
There are lots of records for each value.
What I want to know is which pair of values only appears with each other (or a blank EIN
).
ID EIN REG
12 321 124
13 321 125
14 322 168
15 322 168
16 323 171
17 323 171
18 323
So in the above example, I'd want to select 322 and 168, as every time they appear, they appear together, but I also want to select 323 and 171, since they never appear with another value, just a non-value. 100% of records have an EIN, but a smaller portion have a Registration ID.
Any suggestions on how to query for this?
Upvotes: 1
Views: 148
Reputation: 659137
One of many possible ways:
SELECT DISTINCT ein, reg
FROM test t
WHERE reg is NOT NULL
AND NOT EXISTS (
SELECT 1 FROM test t1
WHERE t1.ein = t.ein AND t1.reg <> t.reg
OR t1.reg = t.reg AND t1.ein <> t.ein);
ein | reg
----+-----
323 | 171
322 | 168
EIN
is NOT NULL
, but REG
can be NULL
, since you wrote:100% of records have an EIN, but a smaller portion have a Registration ID.
NULL
never qualifies with the operator <>
, so (323, 171)
is not excluded.
I exclude rows with reg IS NULL
from the result, that seems to be your intention. So
is not included.(323, NULL)
Upvotes: 3
Reputation: 4703
I got the results you were looking for using this query:
with cte as
(
select ein, count(distinct reg) as occ
from test b
group by ein
having count(distinct reg) = 1
)
select distinct t.ein, t.reg
from cte c
join test t
on c.ein = t.ein;
However, I'm sure there are other ways to do this.
Upvotes: 5