Saafi
Saafi

Reputation: 25

Query to fetch minimum id from a table with duplicate records

There is a tableA.

Source Legacy_Address_ID City State_Code Pincode ADR_LINE_1
C T173005029 LANCASTER PA 17602 N LIME ST
S 84191 LANCASTER PA 17602 N LIME ST
S 10020 SAINT CLAIR SHORES MI 48081 MILE RD
C 85230429169 SAINT CLAIR SHORES MI 48081 MILE RD
C 8770211698 JOLIET IL 60434 PO BOX 666
S 9348710 JOLIET IL 60434 PO BOX 666

The records in the table are basically having the same City,state, pin, adr_line_1 but different Legacy_Address_IDs. I want the output as below :

Legacy_Address_ID New_Legacy_Address_ID
T173005029 84191
85230429169 10020
8770211698 9348710

Basically the min address id(having source S) as new_legacy_Address_id and the larger (having source C) as legacy_address_id.

I tried

select * 
from 
    (select 
         legacy_address_id, 
         min(legacy_address_id) over (partition by adr_line_1, city, state_code, pincode 
                                      order by legacy_address_id) as new_legacy_address_id 
     from 
         tableA)
where 
    legacy_address_id <> new_legacy_address_id;

This query returns as output:

Legacy_Address_ID New_Legacy_Address_ID
T173005029 84191
85230429169 10020
9348710 8770211698

The last record is not returned properly. Can someone please help me out?

Upvotes: 0

Views: 360

Answers (1)

MT0
MT0

Reputation: 167867

Use a PIVOT:

SELECT Legacy_Address_ID,
       New_Legacy_Address_ID
FROM   tablea
PIVOT (
  MAX(legacy_address_id)
  FOR source IN (
    'C' AS Legacy_Address_ID,
    'S' AS New_Legacy_Address_ID
  )
)

or, conditional aggregation:

SELECT MIN(CASE source WHEN 'C' THEN Legacy_Address_ID END) AS Legacy_Address_ID,
       MIN(CASE source WHEN 'S' THEN Legacy_Address_ID END) AS New_Legacy_Address_ID
FROM   tablea
GROUP BY City, State_Code, Pincode, ADR_LINE_1

or, using conditional aggregation with analytic functions:

select * 
from   (
  select legacy_address_id, 
         min(CASE source WHEN 'S' THEN legacy_address_id END) over (
           partition by adr_line_1, city, state_code, pincode
         ) as new_legacy_address_id,
         source
  from   tableA
)
where  source = 'C';

Which, for your sample data:

CREATE TABLE tablea (Source, Legacy_Address_ID, City, State_Code, Pincode, ADR_LINE_1 ) AS
SELECT 'C', 'T173005029',  'LANCASTER',          'PA', 17602, 'N LIME ST'  FROM DUAL UNION ALL
SELECT 'S', '84191',       'LANCASTER',          'PA', 17602, 'N LIME ST'  FROM DUAL UNION ALL
SELECT 'S', '10020',       'SAINT CLAIR SHORES', 'MI', 48081, 'MILE RD'    FROM DUAL UNION ALL
SELECT 'C', '85230429169', 'SAINT CLAIR SHORES', 'MI', 48081, 'MILE RD'    FROM DUAL UNION ALL
SELECT 'C', '8770211698',  'JOLIET',             'IL', 60434, 'PO BOX 666' FROM DUAL UNION ALL
SELECT 'S', '9348710',     'JOLIET',             'IL', 60434, 'PO BOX 666' FROM DUAL;

All output:

LEGACY_ADDRESS_ID NEW_LEGACY_ADDRESS_ID
8770211698 9348710
T173005029 84191
85230429169 10020

db<>fiddle here

Upvotes: 1

Related Questions