user2194253

Reputation: 11

Query Performance while using Oracle aggregate function

Below is a query in which I am using an aggregate function. The WHERE clause is simple, with an index on corpId and incoming_date. If I simply fetch all rows, or count them, the query takes less than a second. However, when I use the aggregate function, the query takes about 4 minutes. I am using Oracle 11i, and the WHERE clause retrieves around 64,000 rows in total. The table and index statistics have been gathered recently, and no new rows have been added to the table.

Please suggest how I can improve the speed.

SELECT
sum(paid_amt) totalamount
FROM test_table e
WHERE  e.corpId = 6
AND    e.incoming_date >= to_date('01-12-2012','dd-mm-yyyy')
AND    e.incoming_date <= to_date('09-01-2013','dd-mm-yyyy')
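For reference, the fast count mentioned above would be the same query with COUNT(*) in place of the sum (a sketch, assuming the count used the same predicate):

SELECT COUNT(*)
FROM   test_table e
WHERE  e.corpId = 6
AND    e.incoming_date >= to_date('01-12-2012','dd-mm-yyyy')
AND    e.incoming_date <= to_date('09-01-2013','dd-mm-yyyy')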

Upvotes: 1

Views: 6086

Answers (2)

Dave

Reputation: 151

I recreated this scenario like this...

CREATE TABLE test_table 
  ( id integer
  , corpId integer
  , paid_amt number(10,2)
  , incoming_date DATE );

ALTER TABLE test_table
add CONSTRAINT test_table_pk PRIMARY KEY (id);

create index test_table_nui_1 on test_table(corpId);

create index test_table_nui_2 on test_table(incoming_date);

create sequence test_table_seq;

insert into test_table
  select test_table_seq.nextval
       ,MOD(test_table_seq.currval,6)
       ,MOD(test_table_seq.currval,10) + 1
       ,sysdate - MOD(test_table_seq.currval,200) 
  from all_objects, user_objects;  

The Cartesian join between all_objects and user_objects is just a hack to get a large number of records inserted quickly (657,000 rows in this case).
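A row generator against dual works just as well if you prefer not to depend on the sizes of the dictionary views (a sketch; the 657,000 target simply matches the row count above):

insert into test_table
  select test_table_seq.nextval        -- id
       , MOD(n, 6)                     -- corpId in 0..5
       , MOD(n, 10) + 1                -- paid_amt in 1..10
       , sysdate - MOD(n, 200)         -- dates spread over ~200 days
  from (select level n from dual connect by level <= 657000);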

Running through the selects: first, summing all 657,000 rows...

select sum(paid_amt)
from test_table;

Plan
SELECT STATEMENT ALL_ROWS  Cost: 621  Bytes: 13  Cardinality: 1
  2 SORT AGGREGATE  Bytes: 13  Cardinality: 1
    1 TABLE ACCESS FULL TABLE DAVE.TEST_TABLE  Cost: 621  Bytes: 9,923,914  Cardinality: 763,378

Then 109,650 rows for a single corpId...

select sum(paid_amt)
from test_table
where corpId = 5;

Plan
SELECT STATEMENT ALL_ROWS  Cost: 265  Bytes: 26  Cardinality: 1
  3 SORT AGGREGATE  Bytes: 26  Cardinality: 1
    2 TABLE ACCESS BY INDEX ROWID TABLE DAVE.TEST_TABLE  Cost: 265  Bytes: 3,310,138  Cardinality: 127,313
      1 INDEX RANGE SCAN INDEX DAVE.TEST_TABLE_NUI_1  Cost: 213  Cardinality: 3,054

And finally, 20,836 rows when also restricting by date...

SELECT sum(paid_amt) totalamount
FROM test_table e
WHERE  e.corpId = 5
AND e.incoming_date >= to_date('01-12-2012','dd-mm-yyyy')
AND e.incoming_date <= to_date('09-01-2013','dd-mm-yyyy')

Plan
SELECT STATEMENT ALL_ROWS  Cost: 265  Bytes: 35  Cardinality: 1
  3 SORT AGGREGATE  Bytes: 35  Cardinality: 1
    2 TABLE ACCESS BY INDEX ROWID TABLE DAVE.TEST_TABLE  Cost: 265  Bytes: 871,360  Cardinality: 24,896
      1 INDEX RANGE SCAN INDEX DAVE.TEST_TABLE_NUI_1  Cost: 213  Cardinality: 3,054

All 3 queries were rapid (i.e. under 0.5 seconds).

An alternative would be to drop nui_1 and nui_2 and create a combined index on both columns. The query then ran in 31 ms on my database:

create index test_table_nui_3 on test_table(corpId, incoming_date);

Plan
SELECT STATEMENT ALL_ROWS  Cost: 15  Bytes: 35  Cardinality: 1
  3 SORT AGGREGATE  Bytes: 35  Cardinality: 1
    2 TABLE ACCESS BY INDEX ROWID TABLE DAVE.TEST_TABLE  Cost: 15  Bytes: 871,360  Cardinality: 24,896
      1 INDEX RANGE SCAN INDEX DAVE.TEST_TABLE_NUI_3  Cost: 3  Cardinality: 14

This suggests that the aggregate function is not the problem, but your indexing may be. The best thing to do would be to check your explain plans.
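For example, with the standard EXPLAIN PLAN / DBMS_XPLAN tools (substitute your own query):

EXPLAIN PLAN FOR
SELECT sum(paid_amt) totalamount
FROM   test_table e
WHERE  e.corpId = 6
AND    e.incoming_date >= to_date('01-12-2012','dd-mm-yyyy')
AND    e.incoming_date <= to_date('09-01-2013','dd-mm-yyyy');

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);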

Upvotes: 0

Quassnoi

Reputation: 425623

Include paid_amt into the index:

CREATE INDEX
        ix_testtable_cord_date_paid
ON      test_table (corpId, incoming_date, paid_amt)

If you have an index on just (corpId, incoming_date) and try to test speed like this:

SELECT  COUNT(*)
FROM    test_table e
WHERE   e.corpId = 6
        AND e.incoming_date >= to_date('01-12-2012','dd-mm-yyyy')
        AND e.incoming_date <= to_date('09-01-2013','dd-mm-yyyy')

you are not querying for anything outside the index, so the query can be satisfied with an INDEX (RANGE SCAN) alone.

As soon as you select anything not in the index (paid_amt in your case), the query needs to use an additional TABLE ACCESS (BY INDEX ROWID) step to retrieve the record from the table.

These are random lookups in a nested loop, and they are slow, especially if your table records are large (lots of fields, or long fields).

The optimizer may even deem this access method less efficient than a FULL SCAN and use the latter instead.
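To check which access path the optimizer actually chooses, compare the plan with and without an index hint (a diagnostic sketch; the hint name matches the index suggested above):

SELECT  /*+ INDEX(e ix_testtable_cord_date_paid) */
        sum(paid_amt) totalamount
FROM    test_table e
WHERE   e.corpId = 6
        AND e.incoming_date >= to_date('01-12-2012','dd-mm-yyyy')
        AND e.incoming_date <= to_date('09-01-2013','dd-mm-yyyy')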

Upvotes: 3
