Reputation: 85
I have a scenario where customer data is created in one schema and pushing to another schema for business operations. In some cases, the data is not pushed so I am using the below query to identify missing data. But it is kept on running, not returning the data as both tables are having more than 5 million records. How to optimize and get the result?
SELECT CUSTID FROM SCHEMA1.CUSTOMER WHERE CUSTID NOT IN(SELECT ID FROM SCHEMA2.CUST_TBL);
After adding one more filter, the query response is same.
SELECT CUSTID FROM SCHEMA1.CUSTOMER WHERE CUSTID NOT IN(SELECT ID FROM SCHEMA2.CUST_TBL WHERE CUST_TYPE=0);
CUSTID and ID are indexed, unique, and not null columns. CUST_TYPE is not null but not indexed.
Also tried with /*+parallel(50)*/
Oracle version: 19c
Upvotes: 0
Views: 273
Reputation: 4416
Minus requires sorting, so is not very efficient. A better approach would be an outer join:
select custid
from SCHEMA1.CUSTOMER c1
left join SCHEMA2.CUST_TBL c2
on c1.custid = c2.id
where c2.id is null
If you have enough resources (CPU, IO, memory), you can use parallelism to improve the performance further.
Upvotes: 1
Reputation: 2210
Have you tried not exist.
SELECT CUSTID FROM SCHEMA1.CUSTOMER A WHERE NOT EXIST
(SELECT 1 FROM SCHEMA2.CUST_TBL B WHERE A.ID = B.ID );
Upvotes: 1
Reputation: 142720
How about the minus
set operator?
select custid from schema1.customer
minus
select id from schema2.cust_tbl;
Upvotes: 0