Reputation: 19
Let's say there are 2 tables.
Table 1 : match
Table 2 : normalized
MATCH TABLE : COLUMN NAME : UUID
NORMALIZED TABLE : COLUMN NAMES : UUID, SSN, DOB
In MATCH table there is a column "UUID" and in NORMALIZED table there are columns 'UUID', SSN and DOB. After joining these two tables, need to fetch the uuid which are having unique SSN and DOB
Need to fetch the matched UUIDs records on the below tables by joining 2 tables
MATCH TABLE : UUID Column
Record # | uuid |
---|---|
1 | 123456789 |
2 | 123456789 |
3 | 123456789 |
4 | 123456789 |
5 | 998764556 |
6 | 112233445 |
NORMALIZED TABLE : UUID, SSN, DOB columns
Record # | uuid | ssn | dob | comment |
---|---|---|---|---|
1 | 123456789 | 54321 | 03/08/1986 | SSN is repeated in record #3 |
2 | 123456789 | 98765 | 03/08/1986 | DOB is repeated in record #1 |
3 | 123456789 | 54321 | 03/08/1986 | SSN is repeated in record #1 |
4 | 123456789 | 77654 | 10/10/2000 | SSN is repeated in record #5 |
5 | 998764556 | 77654 | 11/11/1960 | SSN is repeated in record #4 |
6 | 112233445 | 66546 | 07/14/1970 | SSN and DOB is unique |
OUTPUT : Need to display "All UUID's which have SSN and DOB are unique".
Record # | uuid |
---|---|
6 | 112233445 |
This record should display in the output as this UUID has both SSN and DOB are unique from the table.
I have tried with the below queries:
Scenario 1 : Find all UUIDs with Unique SSN
QUERY : select * from MATCH_TABLE where UUID in (select DISTINCT UUID from NORMALIZED_TABLE A,MATCH_TABLE B where A.uuid=B.uuid and A.SSN is not null AND A.SSN in (SELECT C.SSN from NORMALIZED_TABLE C GROUP BY C.SSN having count(1)<2)
Scenario 2 : Find all UUIDs with Unique DOB
QUERY: select * from MATCH_TABLE where UUID in (select DISTINCT UUID from NORMALIZED_TABLE A,MATCH_TABLE B where A.uuid=B.uuid and A.DOB is not null AND A.DOB in (SELECT C.DOB from NORMALIZED_TABLE C GROUP BY C.DOB having count(1)<2)
Scenario 3 : Find all UUIDs with Unique SSN and DOB.
QUERY: select * from MATCH where UUID in (select distinct a.uuid from NORMALIZED a,MATCH b where a.uuid=b.uuid and a.ssn is not null and a.dob is not null and CONCAT(a.ssn,a.dob) in (select concat(c.ssn,c.dob) from NORMALIZED c group by c.ssn,c.dob having count(1)<2))
I'm not getting the correct output. Can anyone please help me with the query.
Upvotes: 1
Views: 383
Reputation: 173046
Another solution would be
select * from (
select as value any_value(t) from `project.dataset.table` t
group by ssn having count(1) = 1
) intersect distinct
select * from (
select as value any_value(t) from `project.dataset.table` t
group by dob having count(1) = 1
)
if applied to sample data in your question - output is
Upvotes: 1
Reputation: 10212
Try windowed count(*) over(partition by column_name)
:
with mytable as (
select 1 as record, 123456789 as uuid, 54321 as ssn, '03/08/1986' as dob union all
select 2, 123456789, 98765, '03/08/1986' union all
select 3, 123456789, 54321, '03/08/1986' union all
select 4, 123456789, 77654, '10/10/2000' union all
select 5, 998764556, 77654, '11/11/1960' union all
select 6, 112233445, 66546, '07/14/1970'
)
select *
from (
select
*,
count(*) over (partition by ssn) as ssn_count,
count(*) over (partition by dob) as dob_count
from mytable
)
where ssn_count = 1 and dob_count = 1
Upvotes: 0