Manu
Manu

Reputation: 269

Oracle partitioned table query cost vs non-partitioned table query cost

I have a table PO_HEADER with ~20 million records. Considering our future load on the table we have decided to partitioned the table to increase the performance of the sql queries. Below are the queries used to create the new partitioned tables.

CREATE TABLE PO_HEADER_LP 
 PARTITION BY LIST (BUYER_IDENTIFIER)
(PARTITION GC66287246AA VALUES ('GC66287246AA') TABLESPACE MITRIX_TABLES,
PARTITION GC43837235JK VALUES ('GC43837235JK') TABLESPACE MITRIX_TABLES,
PARTITION GC84338293AA VALUES ('GC84338293AA') TABLESPACE MITRIX_TABLES,
PARTITION DEFAULTBUID VALUES (DEFAULT) TABLESPACE MITRIX_TABLES) 
AS SELECT * 
   FROM PO_HEADER;

create index PO_HEADER_LP_SI_IDX on PO_HEADER_LP("SUPPLIER_IDENTIFIER") TABLESPACE MITRIX_INDEXES LOCAL;

Old Table PO_HEADER has two indexes on "BUYER_IDENTIFIER" and "SUPPLIER_IDENTIFIER" columns as follows:

create index PO_HEADER_BI_IDX on  PO_HEADER("BUYER_IDENTIFIER") TABLESPACE MITRIX_INDEXES;
create index PO_HEADER_SI_IDX on  PO_HEADER("SUPPLIER_IDENTIFIER") TABLESPACE MITRIX_INDEXES;

To test the performance of the query, I executed below query on both the tables. But, to my wonder I saw the cost of the 2nd query is almost double than the 1st one. Can any body know, why is the query cost is high of the partitioned table compared to normal table. Thanks in Advance.

select * from po_header where buyer_identifier='GC84338293AA' and supplier_identifier='GC75987723HT'; --cost: 56,941
select * from po_header_lp where buyer_identifier= 'GC84338293AA' and supplier_identifier='GC75987723HT'; --cost: 93,309

PO_HEADER with Global Index on buyer_identifier & supplier_identifier column PO_HEADER with Global Index on buyer_identifier & supplier_identifier column

PO_HEADER_LP with Global Index on supplier_identifier column PO_HEADER_LP with Global Index on supplier_identifier column

PO_HEADER_LP with Local Index on supplier_identifier column PO_HEADER_LP with Local Index on supplier_identifier column

Upvotes: 2

Views: 1506

Answers (2)

Marmite Bomber
Marmite Bomber

Reputation: 21063

From your DDL I assume, you have three big buyers (say 5M records each) and a bunch of smaller ones. In other word this would be the correct setup for you list partitioning schema.

You may verify, whether it works testing access on buyer only:

EXPLAIN PLAN  SET STATEMENT_ID = 'jara1' into plan_table  FOR
select * from tab_lp where BUYER_ID = 1;
;  
SELECT * FROM table(DBMS_XPLAN.DISPLAY('plan_table', 'jara1','ALL'));

------------------------------------------------------------------------------------------------
| Id  | Operation             | Name   | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |        |  6662K|    82M|  4445   (2)| 00:00:01 |       |       |
|   1 |  PARTITION LIST SINGLE|        |  6662K|    82M|  4445   (2)| 00:00:01 |   KEY |   KEY |
|   2 |   TABLE ACCESS FULL   | TAB_LP |  6662K|    82M|  4445   (2)| 00:00:01 |     2 |     2 |
------------------------------------------------------------------------------------------------

The same query for the non-partitioned table should produce much higher cost. Why? In the partitioned table the selected buyer (in your case GC84338293AA, I'm using surrogate keys) has it own partition. So full scan of this partition is the best access.

select * from tab where BUYER_ID = 1;

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |  6596K|    81M| 14025   (1)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| TAB  |  6596K|    81M| 14025   (1)| 00:00:01 |
--------------------------------------------------------------------------

   1 - filter("BUYER_ID"=1)

For the non-partitioned table (to get approximately one fourth of the data) the FULL TABLE SCAN is OK as well, but of course has higher cost as all data must be scanned.

Note - if you see here lower cost, unrealistically low Rows count and/or INDEX ACCESS, than this is the cause of the problem of the underestimating of the cost. So don't worry the old cost are too low, not the new one too high!

The next step is the access on both buyer and supplier. To get the answer you must provide additional information.

How selective is the supplier filter?

I.e. if the predicate buyer_identifier='GC84338293AA' returns say 5M records, how may records return the predicate with both columns?

buyer_identifier='GC84338293AA' and supplier_identifier='GC75987723HT'

Is it 4M or 100 records?

If the complete predicate returns only few records than the local index on supplier is OK.

If it returns large number of rows (say the quarter of the partition) - you should stay on FULL PARTITION SCAN and not use it. This is similar to my comment on the non partitioned table.

Estimation of the supplier cardinality

In case that the column SUPPLIER contains a skewed data (which may fool the CBO to calulate improper cost) you may define explicitely histogram in this column.

I used this statement statement, that calculates the histogram on full data (100% is important for highly skewed data) and for the table and partition.

exec dbms_stats.gather_table_stats(ownname=>user,tabname=>'TAB_LP',granularity=>'all',estimate_percent => 100,METHOD_OPT => 'for columns SUPPLIER_ID size 254');

This worked for my test data, i.e. for supplier with low cardinality an index access was opened (on local no-prefixed index) and for huge suppliers a full partition scan was used.

Upvotes: 1

Md. Shamim Al Mamun
Md. Shamim Al Mamun

Reputation: 386

You can create a Local partitioned index using this script.

CREATE INDEX PO_HEADER_LOCAL_IDX ON PO_HEADER_LP
(BUYER_IDENTIFIER, SUPPLIER_IDENTIFIER)
LOCAL (
       PARTITION GC66287246AA,  
       PARTITION GC43837235JK,  
       PARTITION GC84338293AA,  
       PARTITION DEFAULTBUID
      );

Also it is recommended to gather statistics of the newly created partition table using this script:

EXEC DBMS_STATS.GATHER_TABLE_STATS('SCHEMA Name','PO_HEADER_LP');

Now you can generate the execution plan again of the following SQL:

select * from po_header_lp where buyer_identifier= 'GC84338293AA' and supplier_identifier='GC75987723HT';

Hope this will help you.

Upvotes: 0

Related Questions