Reputation: 53
I'm designing a UNION query to merge two tables with customer informations in a oracle 11g database. The first table a is the 'major' source the second table b is a additional source with new and duplicate entries.
The duplicates in b can not be eliminated by using UNION in fact of not equal fields like the autoincremented ID which is necessary to select.
Table a
ID CUSTOMER_NUMBER NAME STREET
1 4711 Dirk Downstreet 4
2 4721 Hans Mainstreet 5
Table b
ID CUSTOMER_NUMBER NAME STREET
44 4711 Dirk Downstreet 4 <== Duplicate
4 4741 Harry Crossroad 9 <== new
Expected Result
ID CUSTOMER_NUMBER NAME STREET DATASOURCE
1 4711 Dirk Downstreet 4 SAP <== from a
2 4721 Hans Mainstreet 5 SAP <== from a
4 4741 Harry Crossroad 9 MANUAL <== from b
I'm pretty happy with the following - simplified - trial:
SELECT CUSTOMER_NUMBER,
MAX(ID) KEEP (DENSE_RANK FIRST ORDER BY DATASOURCE DESC) ID,
MAX(NAME) KEEP (DENSE_RANK FIRST ORDER BY DATASOURCE DESC) NAME,
MAX(STREET) KEEP (DENSE_RANK FIRST ORDER BY DATASOURCE DESC) STREET,
FROM
(SELECT "ID","CUSTOMER_NUMBER","NAME","STREET", 'SAP' as DATASOURCE FROM CUSTOMERS
UNION ALL
SELECT "ID","CUSTOMER_NUMBER","NAME","STREET", 'MANUAL' as DATASOURCE FROM CUSTOMERS_MANUAL) united
group by CUSTOMER_NUMBER
But I have to SELECT every single field by DENSE_RANK FIRST ORDER BY DATASOURCE DESC which are about 20 fields ...
Can anyone show me a better approch?
Upvotes: 1
Views: 54
Reputation: 21043
The alternative to KEEP
for each column is to use ROW_NUMBER
with the partitioning on your unique key and propper order and select only the row with the number one.
Example for CUSTOMER_NUMBER
as unique key, prefering MANUAL
over SAP
and expecting the ID
is unique in each source.:
SELECT * FROM
(
SELECT
"ID","CUSTOMER_NUMBER","NAME","STREET",
roww_number() over (partition by CUSTOMER_NUMBER order by decode(DATASOURCE,'SAP',2,'MANUAL',1), ID) as RN
FROM
(SELECT "ID","CUSTOMER_NUMBER","NAME","STREET", 'SAP' as DATASOURCE FROM CUSTOMERS
UNION ALL
SELECT "ID","CUSTOMER_NUMBER","NAME","STREET", 'MANUAL' as DATASOURCE FROM CUSTOMERS_MANUAL) united
) WHERE RN = 1
This works fine even in case that the individual sources deliver duplicates. Adjust the order columns so, that the query remains deterministic, i.e. repeated query provides the same result (e.g. add NAME
if the ID
column can be duplicated in SAP
)
Upvotes: 2