Reputation: 432

How to make the 'not exist' faster

I have a problem. My DB is Oracle 9. Here is my SQL:

SELECT COUNT(distinct A.person_id)
        INTO p_record_count
        FROM A,
             B,
             F,
             T
       WHERE B.position_id = F.position_id **--Normal bussiness association**
         AND T.field2 = A.org_id           **--Normal bussiness association**
         AND F.Organization_Id = A.org_id  **--Normal bussiness association**
         AND A.person_id = F.person_id     **--Normal bussiness association**
         AND F.primary_flag = 'Y'          **--Normal bussiness association**
         and sysdate between F.effective_start_date and
             F.effective_end_date          **--Normal bussiness association**
         AND NOT EXISTS                    --very slow
                (SELECT log.person_id
                FROM cper.ehr_access_log log
               WHERE log.person_id = A.person_id
                 AND log.access_page = 'login.do'
                 AND trunc(log.access_time) BETWEEN
                     to_date(p_startDate, 'yyyy-mm-dd') AND
                     to_date(p_endDate, 'yyyy-mm-dd'))
         AND A.enable_flag = 'Y';            **--Normal bussiness association**

Requirement: Get people who didn't login.

The logic: Get all people I can get, then minus the people who login.

I have only a table with the login records.

Table cper.ehr_access_log has more than 10M records. It is a log table. This SQL takes about 30seconds.

Within 5s is possible? Thanks for your help.

Thank you~!I will try.Have a good weekend~ : )

Upvotes: 1

Answers (4)

Miserable Variable

Reputation: 28752

Table cper.ehr_access_log has more than 10M records.It is a log table.This sql takes about 30seconds. Within 5s is possible?

I would say it is not possible without adding some indexes. But you might want to try the following suggestions

Use a query plan to identify the hotspot. That will likely be tablescan required for ehr_access_log and you can probably use that argument to convince others that you need the index.
Try replacing subquery with outer join as others have suggested
You have joins on B and T but not predicates on them. Do you really need them in the query?
You might be able to save some time by avoiding trunc calls (not sure). As an alternative, modify p_startDate beginning of day and p_endDate to end of day

Upvotes: 3

nonsleepr

Reputation: 811

I think you can optimize you query by selecting only DISTINCT log.person_id records, it will decrease amount of records from your log.

You can also try to make an EXCLUDE from the first part:

SELECT COUNT(distinct A.person_id)
    INTO p_record_count
    FROM A,
         B,
         F,
         T
   LEFT OUTER JOIN cper.ehr_access_log log
     ON log.person_id = A.person_id
        AND log.access_page = 'login.do'
        AND trunc(log.access_time) BETWEEN
        to_date(p_startDate, 'yyyy-mm-dd') AND
        to_date(p_endDate, 'yyyy-mm-dd')
   WHERE B.position_id = F.position_id **--Normal bussiness association**
     AND T.field2 = A.org_id           **--Normal bussiness association**
     AND F.Organization_Id = A.org_id  **--Normal bussiness association**
     AND A.person_id = F.person_id     **--Normal bussiness association**
     AND F.primary_flag = 'Y'          **--Normal bussiness association**
     and sysdate between F.effective_start_date and
         F.effective_end_date          **--Normal bussiness association**
     AND log.person_id IS null
     AND A.enable_flag = 'Y';            **--Normal bussiness association**

Upvotes: 1

Siva Charan

Reputation: 18064

For a better performance, you must INDEX the table cper.ehr_access_log

Since it is a LOG table, make a monthly schedule to INDEX this table based on person_id. This will reduce some amount of time in quering.

Upvotes: 1

Ed Guiness

Reputation: 35227

The 'not exists' part of your query is known as a correlated subquery. You can generally get better performance if you re-write as an OUTER join.

The pattern is

select a from b where not exists (select 1 from c where b.a = c.a)

becomes

select a from b left outer join c on b.a = c.a where c.a is null

Upvotes: 2

How to make the &#39;not exist&#39; faster

Within 5s is possible? Thanks for your help.

Answers (4)

Related Questions

How to make the 'not exist' faster