milheiros
milheiros

Reputation: 619

SQL Why don't use PK index?

I have this complicated SQL query:

  SELECT f1 (d1.prdecdde),
         f2 (d1.prdecdde),
         f3 (d1.prdecdde),
         f4 (1, d1.prdecdde, d1.prdenpol),
         d1.prdeisin,
         f6 (d1.prdecdde, a.POLIRCTB),
         NVL (a.poliagtb, a.poliagta),
         d1.prdedtpr,
         prdeticu
    FROM (  SELECT prdecdde,
                   prdenpol,
                   prdeano,
                   SUM (NVL (prdeval, 0)) valantes,
                   NULL valdepois,
                   prdedtpr,
                   prdeticu,
                   prdeisin
              FROM stat_pro_det
             WHERE     prdedprv = '20151101'
                   AND prdecdde IN (700,
                                    100,
                                    610,
                                    600,
                                    710,
                                    900,
                                    910)
                   AND prdeval > 0
          GROUP BY prdecdde,
                   prdenpol,
                   prdeano,
                   prdedtpr,
                   prdeticu,
                   prdeisin
          UNION ALL
            SELECT prdecdde,
                   prdenpol,
                   prdeano,
                   NULL,
                   SUM (NVL (prdeval, 0)) valdepois,
                   prdedtpr,
                   prdeticu,
                   prdeisin
              FROM stat_pro_det
             WHERE     prdedprv = '20160727'
                   AND prdecdde IN (700,
                                    100,
                                    610,
                                    600,
                                    710,
                                    900,
                                    910)
                   AND prdeval > 0
          GROUP BY prdecdde,
                   prdenpol,
                   prdeano,
                   prdedtpr,
                   prdeticu,
                   prdeisin) d1,
         sgss.dtpoli a
   WHERE a.policdde = d1.prdecdde AND a.polinpol = d1.prdenpol
  HAVING SUM (NVL (d1.valdepois, 0) - NVL (d1.valantes, 0)) <> 0
GROUP BY d1.prdecdde,
         d1.prdenpol,
         d1.prdeano,
         a.polirctb,
         a.poliagta,
         a.poliagtb,
         d1.prdedtpr,
         d1.prdeticu,
         d1.prdeisin;

The primary key for dtpoli table is this:

CREATE UNIQUE INDEX SGSS.PK_DTPOLI ON SGSS.DTPOLI
(POLICDDE, POLINPOL)

Here is the explain plan:

Plan hash value: 1960385779

--------------------------------------------------------------------------------------------------------------
| Id  | Operation                          | Name            | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                   |                 | 19403 |  1705K|       |   113K  (1)| 00:00:05 |
|*  1 |  FILTER                            |                 |       |       |       |            |          |
|   2 |   HASH GROUP BY                    |                 | 19403 |  1705K|    38M|   113K  (1)| 00:00:05 |
|*  3 |    HASH JOIN                       |                 |   388K|    33M|    23M|   111K  (1)| 00:00:05 |
|   4 |     TABLE ACCESS FULL              | DTPOLI          |   618K|    16M|       |  6561   (7)| 00:00:01 |
|   5 |     VIEW                           |                 |   388K|    22M|       |   103K  (1)| 00:00:05 |
|   6 |      UNION-ALL                     |                 |       |       |       |            |          |
|   7 |       HASH GROUP BY                |                 |   194K|  9304K|    13M| 52044   (1)| 00:00:03 |
|   8 |        INLIST ITERATOR             |                 |       |       |       |            |          |
|*  9 |         TABLE ACCESS BY INDEX ROWID| STAT_PRO_DET    |   194K|  9304K|       | 50003   (1)| 00:00:02 |
|* 10 |          INDEX RANGE SCAN          | STAT_PRO_DET_03 |   198K|       |       |   790   (2)| 00:00:01 |
|  11 |       HASH GROUP BY                |                 |   193K|  9264K|    13M| 51818   (1)| 00:00:03 |
|  12 |        INLIST ITERATOR             |                 |       |       |       |            |          |
|* 13 |         TABLE ACCESS BY INDEX ROWID| STAT_PRO_DET    |   193K|  9264K|       | 49784   (1)| 00:00:02 |
|* 14 |          INDEX RANGE SCAN          | STAT_PRO_DET_03 |   197K|       |       |   783   (2)| 00:00:01 |
--------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(SUM(NVL("D1"."VALDEPOIS",0)-NVL("D1"."VALANTES",0))<>0)
   3 - access("POLICDDE"="D1"."PRDECDDE" AND "POLINPOL"="D1"."PRDENPOL")
   9 - filter("PRDEVAL">0)
  10 - access("PRDEDPRV"='20151101' AND ("PRDECDDE"=100 OR "PRDECDDE"=600 OR "PRDECDDE"=610 OR 
              "PRDECDDE"=700 OR "PRDECDDE"=710 OR "PRDECDDE"=900 OR "PRDECDDE"=910))
  13 - filter("PRDEVAL">0)
  14 - access("PRDEDPRV"='20160727' AND ("PRDECDDE"=100 OR "PRDECDDE"=600 OR "PRDECDDE"=610 OR 
              "PRDECDDE"=700 OR "PRDECDDE"=710 OR "PRDECDDE"=900 OR "PRDECDDE"=910))

Both columns are number datatype. Using the hint parallel(#) I can improve the performance but my focus is on dtpoli PK.

I can't find why this query doesn't use this primary key index and uses a Full Table Scan on DTPOLI table. It's because I have a Group by clause? I really don't understand. Any help? I'm using Oracle 11gR2.

Upvotes: 1

Views: 134

Answers (1)

Alex Poole
Alex Poole

Reputation: 191560

It doesn't use the index because it would be less efficient to do so. An index is useful if you're retrieving a small proportion of the data from the table, but when you're getting a lot of it, using the index would be slower.

The reason is that finding the matching row in the index requires a disk access, to get the index block. That gives you the ROWID of the data record, and you then need another disk access to get that data block. Each index and data block has to be read at least once, and possibly many times.

The blocks may be in the buffer cache, but you're still hitting that twice, and because you'd be jumping around to different parts of the index and table you're increasing the likelihood of things having aged out - which means that even if you end up getting two rows from the same physical data block, you might end up having to read it from disk twice.

A full table scan will retrieve all of the data blocks for the table in one go, so it doesn't have to read any of them twice, and doesn't have the additional overhead of also reading the index blocks.

If you were only referring to the columns in the primary key (or any other index) then a full index scan might be used. But you're retrieving non-index data, such as poliagtb, so the data blocks have to be retrieved too.

Your primary key is enforcing referential integrity. It can also be used to retrieve specific data quickly, but only when appropriate. The optimiser does a pretty good job of deciding when it is and isn't appropriate.

Upvotes: 6

Related Questions