Reputation: 343
Using Oracle 10gR2 on LINUX, I'm trying to tune the following query.
I'm pretty sure that getting rid of the correlated subqueries and the possible use of some analytic functions may be the optimal way to go, but I'm just not getting it -- especially with the nested correlated subquery that selects on the MAX(TABLE_2.NOTE_DATE). Any help would be much appreciated. Thanks.
EXPLAIN PLAN FOR
SELECT TABLE_4.INCIDENT_TYPE,
TABLE_4.POC_CONTACT,
(SELECT TABLE_2.NOTE_DATE
|| ' '
|| TABLE_1.USER_FIRST_NAME
|| ' '
|| TABLE_1.USER_LAST_NAME
|| ' : '
|| TABLE_2.OTHER_HELP_NOTES
FROM TABLE_1,
TABLE_2
WHERE TABLE_2.USER_ID = TABLE_1.USER_ID
AND TABLE_2.REC_ID = TABLE_4.REC_ID
AND TABLE_2.NOTE_DATE = (SELECT MAX(TABLE_2.NOTE_DATE)
FROM TABLE_2
WHERE TABLE_2.REC_ID = TABLE_4.REC_ID
AND TABLE_2.NOTE_DATE <=
TABLE_4.REPORT_DATE))
AS SUM_OF_SHORTAGE,
(SELECT TABLE_3.NOTE_DATE
|| ' '
|| TABLE_1.USER_FIRST_NAME
|| ' '
|| TABLE_1.USER_LAST_NAME
|| ' : '
|| TABLE_3.HELP_NOTES
FROM TABLE_1,
TABLE_3
WHERE TABLE_3.USER_ID = TABLE_1.USER_ID
AND TABLE_3.REC_ID = TABLE_4.REC_ID
AND TABLE_3.NOTE_DATE = (SELECT MAX(TABLE_3.NOTE_DATE)
FROM TABLE_3
WHERE TABLE_3.REC_ID = TABLE_4.REC_ID
AND TABLE_3.NOTE_DATE <=
TABLE_4.REPORT_DATE)) AS HELP_NOTES,
TABLE_4.REPORT_NUM
FROM TABLE_4
WHERE TABLE_4.SITE_ID = '1';
@C:\ORACLE\PRODUCT\11.2.0\CLIENT_1\RDBMS\ADMIN\UTLXPLS.SQL;
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
PLAN HASH VALUE: 4036328474
------------------------------------------------------------------------------------------------------------
| ID | OPERATION | NAME | ROWS | BYTES | COST (%CPU)| TIME |
------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 13009 | 2286K| 449 (2)| 00:00:06 |
|* 1 | FILTER | | | | | |
| 2 | NESTED LOOPS | | 3 | 612 | 8 (0)| 00:00:01 |
| 3 | TABLE ACCESS BY INDEX ROWID| TABLE_2 | 3 | 552 | 5 (0)| 00:00:01 |
|* 4 | INDEX RANGE SCAN | IX_TABLE_2_REC_ID | 3 | | 1 (0)| 00:00:01 |
| 5 | TABLE ACCESS BY INDEX ROWID| TABLE_1 | 1 | 20 | 1 (0)| 00:00:01 |
|* 6 | INDEX UNIQUE SCAN | TABLE_1_PK | 1 | | 0 (0)| 00:00:01 |
| 7 | SORT AGGREGATE | | 1 | 13 | | |
|* 8 | TABLE ACCESS BY INDEX ROWID| TABLE_2 | 1 | 13 | 5 (0)| 00:00:01 |
|* 9 | INDEX RANGE SCAN | IX_TABLE_2_REC_ID | 3 | | 1 (0)| 00:00:01 |
|* 10 | FILTER | | | | | |
|* 11 | HASH JOIN | | 17 | 4063 | 482 (2)| 00:00:06 |
|* 12 | TABLE ACCESS FULL | TABLE_3 | 17 | 3723 | 474 (2)| 00:00:06 |
| 13 | TABLE ACCESS FULL | TABLE_1 | 1504 | 30080 | 8 (0)| 00:00:01 |
| 14 | SORT AGGREGATE | | 1 | 13 | | |
|* 15 | TABLE ACCESS FULL | TABLE_3 | 1 | 13 | 474 (2)| 00:00:06 |
|* 16 | TABLE ACCESS FULL | TABLE_4 | 13009 | 2286K| 449 (2)| 00:00:06 |
------------------------------------------------------------------------------------------------------------
PREDICATE INFORMATION (IDENTIFIED BY OPERATION ID):
---------------------------------------------------
1 - FILTER("TABLE_2"."NOTE_DATE"= (SELECT /*+ */ MAX("TABLE_2"."NOTE_DATE")
FROM "TABLE_2" "TABLE_2" WHERE "TABLE_2"."REC_ID"=:B1 AND
"TABLE_2"."NOTE_DATE"<=:B2))
4 - ACCESS("TABLE_2"."REC_ID"=:B1)
6 - ACCESS("TABLE_2"."USER_ID"="TABLE_1"."USER_ID")
8 - FILTER("TABLE_2"."NOTE_DATE"<=:B1)
9 - ACCESS("TABLE_2"."REC_ID"=:B1)
10 - FILTER("TABLE_3"."NOTE_DATE"= (SELECT /*+ */
MAX("TABLE_3"."NOTE_DATE") FROM "TABLE_3" "TABLE_3" WHERE
"TABLE_3"."REC_ID"=:B1 AND "TABLE_3"."NOTE_DATE"<=:B2))
11 - ACCESS("TABLE_3"."USER_ID"="TABLE_1"."USER_ID")
12 - FILTER("TABLE_3"."REC_ID"=:B1)
15 - FILTER("TABLE_3"."REC_ID"=:B1 AND "TABLE_3"."NOTE_DATE"<=:B2)
16 - FILTER("TABLE_4"."SITE_ID"=1)
41 ROWS SELECTED
Breaking down this query -- the key problem seems to be the following:
select REC_ID, TO_CHAR(REPORT_DATE,'DD-MON-YY HH:MI:SS') REPORT_DATE,
(SELECT MAX(TABLE_2.note_date) as MAX_DATE
FROM TABLE_2
where TABLE_2.REC_ID = TABLE_1.REC_ID
and TABLE_2.NOTE_DATE <= TABLE_1.REPORT_DATE
) NOTES_MAX_DATE
from TABLE_1 where REC_ID = 121 order by TO_DATE(REPORT_DATE,'DD-MON-YY HH:MI:SS');
Which should return the following:
REC_ID REPORT_DATE NOTES_MAX_DATE
---------------------- ------------------ -------------------------
121 17-APR-10 12:30:00
121 24-APR-10 12:30:00
121 01-MAY-10 12:30:00
121 08-MAY-10 12:30:00
121 15-MAY-10 12:30:00 12-MAY-10
121 22-MAY-10 12:30:01 17-MAY-10
121 29-MAY-10 12:30:01 25-MAY-10
121 05-JUN-10 12:30:00 25-MAY-10
8 rows selected
The output needs to be the same as the above. I tried creating a join as follows:
SELECT TABLE_1.REC_ID, TO_CHAR(TABLE_1.REPORT_DATE,'DD-MON-YY HH:MI:SS') REPORT_DATE, MAX(TABLE_2.NOTE_DATE) AS NOTES_MAX_DATE
FROM TABLE_2,
TABLE_1
where TABLE_2.REC_ID = TABLE_1.REC_ID
AND TABLE_2.NOTE_DATE <= TABLE_1.REPORT_DATE
and ( TABLE_1.SITE_ID = '1' )
and TABLE_1.REC_ID = 121
group by TABLE_1.REC_ID, TABLE_1.REPORT_DATE
order by TO_DATE(REPORT_DATE,'DD-MON-YY HH:MI:SS');
But that yields:
REC_ID REPORT_DATE NOTES_MAX_DATE
---------------------- ------------------ -------------------------
121 15-MAY-10 12:30:00 12-MAY-10
121 22-MAY-10 12:30:01 17-MAY-10
121 29-MAY-10 12:30:01 25-MAY-10
121 05-JUN-10 12:30:00 25-MAY-10
So I'm really stumped. Any ideas? -- Thanks.
Upvotes: 1
Views: 1830
Reputation: 17429
Below is a version that only gets the should remove the correlated sub-queries. It does still use sub-queries, but, as they're in the max
once and FROM
clause rather than the SELECT
clause, the database should do a better job of resolving them. It's probably possible to remove those sub-queries as well, but it's more readable this way. This version also uses the SQL-99 syntax for joins, which is generally considered preferable.
SELECT table_4.incident_type,
table_4.poc_contact,
t2.sum_of_shortage,
t3.help_notes,
table_4.report_num
FROM table_4
LEFT JOIN (SELECT table_2.rec_id,
table_2.note_date
|| ' '
|| table_1.user_first_name
|| ' '
|| table_1.user_last_name
|| ' : '
|| table_2.other_help_notes
AS sum_of_shortage
FROM table_1
JOIN table_2
ON table_2.user_id = table_1.user_id
WHERE table_2.note_date =
(SELECT MAX(table_2.note_date) AS max_date
FROM table_2
WHERE table_2.rec_id = table_4.rec_id
AND table_2.note_date <= table_4.report_date)) t2
ON t2.rec_id = table_4.rec_id
LEFT JOIN (SELECT table_3.rec_id,
table_3.note_date
|| ' '
|| table_1.user_first_name
|| ' '
|| table_1.user_last_name
|| ' : '
|| table_3.other_help_notes
AS help_notes
FROM table_1
JOIN table_3
ON table_3.user_id = table_1.user_id
WHERE table_2.note_date =
(SELECT MAX(table_3.note_date) AS max_date
FROM table_3
WHERE table_3.rec_id = table_4.rec_id
AND table_3.note_date <= table_4.report_date)) t3
ON t3.rec_id = table_4.rec_id
WHERE table_4.site_id = '1';
@shawno: You're right, the with
clause was flawed because I misread your initial query. Above is a corrected version. Because the max
values are specific to each row, the method that you were already using to get those values is probably the most efficient. Your best option for optimizing this appears to just be moving the sub-queries from the select
clause to the from
clause.
Also, this is an untested solution, as I have neither your table structure nor your data. The best I can do without putting far too much work into it is to verified that the syntax is valid.
Upvotes: 2