Reputation: 63
We have a load plan in ODI. We get a repeating error in some of our scenarios due to duplicate records in the I$ table. What we do is manually run this script every time the load plan fails:
DELETE FROM adw12_dw.I$_1558911580_4
WHERE (EFFECTIVE_FROM_DT, DATASOURCE_NUM_ID, INTEGRATION_ID) IN
      (SELECT EFFECTIVE_FROM_DT,
              DATASOURCE_NUM_ID,
              INTEGRATION_ID
         FROM adw12_dw.I$_1558911580_4
        GROUP BY EFFECTIVE_FROM_DT,
              DATASOURCE_NUM_ID,
              INTEGRATION_ID
       HAVING COUNT(1) > 1)
  AND ROWID NOT IN
      (SELECT MIN(ROWID)
         FROM adw12_dw.I$_1558911580_4
        GROUP BY EFFECTIVE_FROM_DT,
                 DATASOURCE_NUM_ID,
                 INTEGRATION_ID
       HAVING COUNT(1) > 1);
COMMIT;
Is there a way to automate the deletion of duplicate records in the Integration table?
Upvotes: 0
Views: 795
Reputation: 3303
Maybe some of your keys are duplicated more than twice. A single delete of MAX(ID) per group only removes one extra copy, so the delete query must be executed recursively until nothing is left to delete. For example:
CREATE OR REPLACE PROCEDURE delete_duplicates
IS
BEGIN
  -- Each pass removes one extra copy per duplicated (USER_ID, TYPE_ID) key
  DELETE FROM TABLE1
   WHERE ID IN (SELECT MAX(ID)
                  FROM TABLE1
                 GROUP BY USER_ID, TYPE_ID
                HAVING COUNT(*) > 1);
  -- Recurse until a pass deletes nothing
  IF SQL%ROWCOUNT > 0 THEN
    delete_duplicates;
  END IF;
END delete_duplicates;
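To see why the recursion is needed, here is a small stand-in sketch (not ODI or Oracle specific; SQLite via Python's sqlite3 replaces the real staging table, and the table/column names are illustrative): with a key that appears three times, a single "delete MAX(ID) per duplicated group" pass removes only one copy, so the delete must be repeated until a pass affects no rows.

```python
import sqlite3

# Toy table standing in for TABLE1; key (user_id, type_id) = (1, 10)
# is inserted three times to create a triplicate.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id INTEGER PRIMARY KEY, user_id INT, type_id INT)")
conn.executemany("INSERT INTO table1 (user_id, type_id) VALUES (?, ?)",
                 [(1, 10), (1, 10), (1, 10), (2, 20)])

def delete_duplicates(conn):
    """Repeat the delete until a pass removes nothing, mirroring the
    recursive PL/SQL procedure above. Returns the number of passes
    that actually deleted rows."""
    passes = 0
    while True:
        cur = conn.execute("""
            DELETE FROM table1 WHERE id IN (
                SELECT MAX(id) FROM table1
                GROUP BY user_id, type_id
                HAVING COUNT(*) > 1)""")
        if cur.rowcount == 0:
            break
        passes += 1
    return passes

passes = delete_duplicates(conn)  # the triplicated key needs two passes
remaining = conn.execute("SELECT COUNT(*) FROM table1").fetchone()[0]
```

Two passes are needed here because each pass only removes one row per duplicated group, which is exactly why a one-shot delete can leave duplicates behind.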
Upvotes: 0
Reputation: 1928
If you have duplicates in the source, the best approach would be to handle that in the logic of the mapping.
What could work is to add an expression component with a row_rank column using an analytic function to rank the duplicates: row_number() over (partition by EFFECTIVE_FROM_DT, DATASOURCE_NUM_ID, INTEGRATION_ID order by ROWID). You can then add a filter with the condition row_rank = 1.
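The ranking-and-filter idea can be sketched as follows (a stand-in, not ODI or Oracle: SQLite via Python's sqlite3 plays the staging table, its rowid plays Oracle's ROWID, and the table name `stage` is made up):

```python
import sqlite3

# Hypothetical staging rows; the first two share the same natural key,
# so they are duplicates of each other.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE stage (
    effective_from_dt TEXT, datasource_num_id INT, integration_id TEXT)""")
conn.executemany("INSERT INTO stage VALUES (?, ?, ?)", [
    ("2020-01-01", 1, "A"),
    ("2020-01-01", 1, "A"),   # duplicate of the first row
    ("2020-01-01", 1, "B"),
])

# Rank rows within each key, then keep only rank 1 -- the same effect
# as the row_rank column plus the "row_rank = 1" filter in the mapping.
rows = conn.execute("""
    SELECT effective_from_dt, datasource_num_id, integration_id
    FROM (SELECT t.*,
                 ROW_NUMBER() OVER (
                     PARTITION BY effective_from_dt, datasource_num_id, integration_id
                     ORDER BY rowid) AS row_rank
          FROM stage t)
    WHERE row_rank = 1""").fetchall()
# rows now holds one row per key: one "A" and one "B"
```

Doing the deduplication in the flow like this means the I$ table never receives the duplicates in the first place, so no cleanup delete is needed afterwards.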
If you prefer to do a delete after inserting, you can edit the IKM and add the delete step before loading the target table.
You could also divide the integration into 3 different steps:
Upvotes: 1