Alex
Alex

Reputation: 1232

Query with 'count(*)' runs in a second. Query without 'count(*)' takes more than an hour. How to optimize it?

I have two queries which our application executes, the only difference between the queries is 'count(*)' column (in one query I have it, in the other I don't).

All the queries are dynamically generated, and we provide our software to clients which they run on their databases (we do not have access to their databases). One of the queries is running very slowly (I couldn't get it to finish after waiting for hours). SQL Tuning Advisor is suggesting accepting sql profile which helps, but that means I would have to tell our client to run it, and accept plan. It would be much better if there was an index we could create to speed up the query.

Here's what the query looks like:

select 
a.company_id
, count(*)
from 
 b
INNER JOIN  a
ON
b.company_id = a.company_id AND
b.sequence_num = a.sequence_num
INNER JOIN  c
ON
b.company_id = c.company_id AND
b.sequence_num = c.sequence_num
INNER JOIN  d 
ON
c.cash_receipt_num = d.cash_receipt_num
INNER JOIN  e
ON
e.code_list_id = 'CONSTANT'
where 
(a.company_id='123')
GROUP BY 
a.company_id
order by
a.company_id ASC

When the query has 'count()', it runs in about a second. The query without 'count()' runs over several hours before I kill it, so I never got it to finish.

Here is a count of records in each table:

select count(*) from a -- 1,007,948
select count(*) from b -- 148,378
select count(*) from c -- 138,901
select count(*) from d -- 136,424
select count(*) from e -- 1

The returned result should be '123', with a count '908,683' if query has count(*) column.

Here's what the execution plan looks like:

--With count (fast):
--------------------------------------------------------------------------------------------
| Id  | Operation              | Name              | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |                   |     1 |    49 |     6   (0)| 00:00:01 |
|   1 |  SORT GROUP BY NOSORT  |                   |     1 |    49 |     6   (0)| 00:00:01 |
|   2 |   NESTED LOOPS         |                   |     1 |    49 |     6   (0)| 00:00:01 |
|   3 |    NESTED LOOPS        |                   |     1 |    39 |     4   (0)| 00:00:01 |
|   4 |     NESTED LOOPS       |                   |     1 |    33 |     4   (0)| 00:00:01 |
|   5 |      NESTED LOOPS      |                   |     1 |    23 |     3   (0)| 00:00:01 |
|*  6 |       INDEX RANGE SCAN | e_KEY00           |     1 |     7 |     1   (0)| 00:00:01 |
|*  7 |       TABLE ACCESS FULL| c                 |     2 |    32 |     2   (0)| 00:00:01 |
|*  8 |      INDEX RANGE SCAN  | b_KEY00           |     1 |    10 |     1   (0)| 00:00:01 |
|*  9 |     INDEX UNIQUE SCAN  | d_KEY00           |     1 |     6 |     0   (0)| 00:00:01 |
|* 10 |    INDEX RANGE SCAN    | a_KEY00           |     1 |    10 |     2   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------

Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------

   1 - SEL$3FA9081A
   6 - SEL$3FA9081A / e@SEL$4
   7 - SEL$3FA9081A / c@SEL$2
   8 - SEL$3FA9081A / b@SEL$1
   9 - SEL$3FA9081A / d@SEL$3
  10 - SEL$3FA9081A / a@SEL$1

Predicate Information (identified by operation id):
---------------------------------------------------

   6 - access("e"."CODE_LIST_ID"='CONSTANT')
   7 - filter("c"."COMPANY_ID"='123')
   8 - access("b"."COMPANY_ID"='123' AND 
              "b"."SEQUENCE_NUM"="c"."SEQUENCE_NUM")
   9 - access("c"."CASH_RECEIPT_NUM"="d"."CASH_RECEIPT_NUM")
  10 - access("a"."COMPANY_ID"='123' AND 
              "b"."SEQUENCE_NUM"="a"."SEQUENCE_NUM")

Column Projection Information (identified by operation id):
-----------------------------------------------------------

   1 - (#keys=1) '123'[3], COUNT(*)[22]
   2 - (#keys=0) 
   3 - (#keys=0) "b"."SEQUENCE_NUM"[NUMBER,22]
   4 - (#keys=0) "c"."CASH_RECEIPT_NUM"[NUMBER,22], 
       "b"."SEQUENCE_NUM"[NUMBER,22]
   5 - (#keys=0) "c"."CASH_RECEIPT_NUM"[NUMBER,22], 
       "c"."SEQUENCE_NUM"[NUMBER,22]
   7 - "c"."CASH_RECEIPT_NUM"[NUMBER,22], 
       "c"."SEQUENCE_NUM"[NUMBER,22]
   8 - "b"."SEQUENCE_NUM"[NUMBER,22]

-- without count (slow)
----------------------------------------------------------------------------------------------
| Id  | Operation                | Name              | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |                   |     1 |    49 |     6   (0)| 00:00:01 |
|   1 |  SORT GROUP BY NOSORT    |                   |     1 |    49 |     6   (0)| 00:00:01 |
|   2 |   NESTED LOOPS SEMI      |                   |     1 |    49 |     6   (0)| 00:00:01 |
|   3 |    NESTED LOOPS SEMI     |                   |     1 |    43 |     6   (0)| 00:00:01 |
|   4 |     NESTED LOOPS         |                   |     1 |    33 |     5   (0)| 00:00:01 |
|   5 |      NESTED LOOPS        |                   |     1 |    23 |     3   (0)| 00:00:01 |
|*  6 |       INDEX RANGE SCAN   | e_KEY00           |     1 |     7 |     1   (0)| 00:00:01 |
|*  7 |       TABLE ACCESS FULL  | c                 |     2 |    32 |     2   (0)| 00:00:01 |
|*  8 |      INDEX FAST FULL SCAN| a_KEY00           |     2 |    20 |     1   (0)| 00:00:01 |
|*  9 |     INDEX RANGE SCAN     | b_KEY00           |   139K|  1366K|     1   (0)| 00:00:01 |
|* 10 |    INDEX UNIQUE SCAN     | d_KEY00           |   136K|   799K|     0   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------

Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------

   1 - SEL$3FA9081A
   6 - SEL$3FA9081A / e@SEL$4
   7 - SEL$3FA9081A / c@SEL$2
   8 - SEL$3FA9081A / a@SEL$1
   9 - SEL$3FA9081A / b@SEL$1
  10 - SEL$3FA9081A / d@SEL$3

Predicate Information (identified by operation id):
---------------------------------------------------

   6 - access("e"."CODE_LIST_ID"='CONSTANT')
   7 - filter("d"."COMPANY_ID"='123')
   8 - filter("a"."COMPANY_ID"='123')
   9 - access("b"."COMPANY_ID"='123' AND 
              "b"."SEQUENCE_NUM"="a"."SEQUENCE_NUM")
       filter("b"."SEQUENCE_NUM"="c"."SEQUENCE_NUM")
  10 - access("c"."CASH_RECEIPT_NUM"="d"."CASH_RECEIPT_NUM")

Column Projection Information (identified by operation id):
-----------------------------------------------------------

   1 - (#keys=1) '123'[3]
   2 - (#keys=0) 
   3 - (#keys=0) "c"."CASH_RECEIPT_NUM"[NUMBER,22]
   4 - (#keys=0) "c"."CASH_RECEIPT_NUM"[NUMBER,22], 
       "c"."SEQUENCE_NUM"[NUMBER,22], 
       "a"."SEQUENCE_NUM"[NUMBER,22]
   5 - (#keys=0) "c"."CASH_RECEIPT_NUM"[NUMBER,22], 
       "c"."SEQUENCE_NUM"[NUMBER,22]
   7 - "c"."CASH_RECEIPT_NUM"[NUMBER,22], 
       "c"."SEQUENCE_NUM"[NUMBER,22]
   8 - "a"."SEQUENCE_NUM"[NUMBER,22]

I'm suspecting the issue is with statistics. I tried running the following:

begin 
DBMS_STATS.GATHER_SCHEMA_STATS (
ownname => 'owner_of_tables_here',
estimate_percent => 100
);
end;

EXEC dbms_stats.gather_database_stats;
EXEC dbms_stats.gather_database_stats(estimate_percent => 100, block_sample => FALSE, method_opt => 'FOR ALL COLUMNS', granularity => 'ALL', cascade => TRUE, options => 'GATHER');

-- for each index mentioned in explain plan:
EXEC DBMS_STATS.GATHER_INDEX_STATS(ownname => 'owner_of_tables_here', indname => 'index name here', estimate_percent => 100)

-- for each of the five tables:
EXEC DBMS_STATS.GATHER_TABLE_STATS(ownname => 'owner_of_tables_here', tabname => 'table name here', estimate_percent => 100, block_sample => FALSE, method_opt => 'FOR ALL COLUMNS', granularity => 'ALL', cascade => TRUE)

Am I missing something? would client have to run sql tuning advisor and accept the suggested sql profile?

Oracle version: 12.1.0.2.0

Explanation for why the query is the way it is: The application from which this query is taken, allows user to select columns from UI. For example if client only wants to see all companies they have access to, then the query above runs, and if they want to see all companies and how many records there are per each company, then the count(*) query executes. The reason there's a "where company_id = '123' ", is because this particular user only has permissions to see one company, however a different user may have permissions to see all or multiple companies, in which case the dynamically generated filter would be different. (I understand the query looks very odd the way it is, but usually query would have a lot of columns, and no 'group by' clause - which runs fast actually.)

Upvotes: 0

Views: 160

Answers (1)

Marmite Bomber
Marmite Bomber

Reputation: 21075

With some speculation as I don't have your exact data: Your column COMPANY_ID in table A is probably skewed. The table has 1M rows with more than 900K rows with company_id = '123'

Check first the execution plan of a simplified query

select * from a  where company_id = '123'

If it is showing some unrealistic low value such as 1 or 2 check if the column COMPANY_ID has a histogram

select HISTOGRAM from user_tab_columns where table_name = 'A' and COLUMN_NAME = 'COMPANY_ID';

I expect no.

Gather histogram for this column e.g. with

exec dbms_stats.gather_table_stats(ownname=>user, tabname=>'a',granularity=>'all',method_opt=>'FOR COLUMNS COMPANY_ID',estimate_percent => 100,cascade=>TRUE);

Check execution plan of a simplified query

select * from a  where company_id = '123'

This should show some 900K rows. I hope this correct cardinality will prevent the use the table A in a wrong position (as in the slow plan).

Upvotes: 1

Related Questions