Optimize slow correlated query in Oracle SQL

Question

I made a query which works, but is a little slow in my opinion. When I suppress the output to 10 rows, it takes 13 minutes to perform the query. This is the query, stripped from some stuff:

SELECT
    (SELECT ANSWER
        FROM (
            SELECT to_number(fiit.ANSWER, '999') ANSWER,
                    foin.CLIENT_ID id,
                    foin.STARTDATE start_date,
                    row_number() over(PARTITION BY foin.CLIENT_ID ORDER BY foin.FORM_ID ASC) rnk
                FROM forms_filled foin, forms_items_filled fiit, treatment trtm
                WHERE foin.FORM_ID = fiit.FORM_ID
                AND foin.CLIENT_ID = trtm.CLIENT_ID
                AND fiit.FORM_NUMBER = 607
                AND fiit.FORM_ITEM_NUMBER = 3779
                AND length(fiit.ANSWER) >= 1
                AND trtm.TREATMENTCODE = 'K'
                AND trtm.ENDDATE BETWEEN TRUNC(to_date('01/01/2014', 'dd/mm/yyyy'), 'DDD') AND TRUNC(to_date('31/12/2014', 'dd/mm/yyyy'), 'DDD')
                AND foin.STARTDATE BETWEEN trtm.STARTDATUM AND NVL(trtm.ENDDATE, to_date('01/01/9999', 'dd/mm/yyyy'))
                ) inn
    WHERE rnk = 1
    AND inn.id = client.CLIENT_ID
    ) form1
FROM treatment trtm, CLIENT client
WHERE trtm.TREATMENTCODE = 'K'
AND client.CLIENT_ID = trtm.CLIENT_ID
AND trtm.ENDDATE BETWEEN TRUNC(to_date('01/01/2014', 'dd/mm/yyyy'), 'DDD') AND TRUNC(to_date('31/12/2014', 'dd/mm/yyyy'), 'DDD')

The outer query results in 175 clients who have a particular treatment code and have a treatment enddate in 2014. Now for each of these clients a lot of other data is retrieved (like name, age, treatment hours) which is not relevant and I left out now. Then there are about 30 similar subqueries, which retrieve answers from forms. I used a correlated query, because to retrieve the answers from these forms, the client id must be known. If that was the only thing the subquery needed to find the data, it wouldn't be a problem, but there is an other requirement: the retrieved forms must be filled in within the treatment period, because I couldn't find a way to push this data from the outer query to the sub-sub-query, I query this again in the subsubquery, which is causing the slow speed.

The reason to have a subquery and a subsubquery is because the N-th ranking answer from a form must be found. In a previous version of my code I didn't have the treatmentcode, treatment start and enddate requirements in the where clause of the subsubquery. This caused the subsubquery to come up with e.g. 4 results which were ranked 1,2,3,4 but not necessarily of forms created within the treatment period, which is wrong.

So adding these lines:

AND trtm.TREATMENTCODE = 'K'
AND trtm.ENDDATE BETWEEN TRUNC(to_date('01/01/2014', 'dd/mm/yyyy'), 'DDD')
AND TRUNC(to_date('31/12/2014', 'dd/mm/yyyy'), 'DDD')
AND foin.STARTDATE BETWEEN trtm.STARTDATUM AND NVL(trtm.ENDDATE, to_date('01/01/9999', 'dd/mm/yyyy'))

caused the query to be correct, where it was not completely correct before. They also caused the query to take several hours, instead of ~40 seconds for 175 rows.

My question now is, how can I rewrite this query to make it faster? I use Oracle 11.2.40 in combination with Toad Data Point 3.5, but I can't see the explain plan unfortunately.

Gordon Linoff · Accepted Answer

If you use the keep keyword to get the first value, you can dispense with the nested subqueries. This, in turn, allows you to use a query correlated with the outer query, so you don't have to re-calculate the results for all rows to get the value for a given row.

The query would look like:

SELECT (SELECT max(to_number(fiit.ANSWER, '999')) keep (dense_rank first order by foin.FORM_ID ASC)
        FROM forms_filled foin JOIN
             forms_items_filled fiit
             ON foin.FORM_ID = fiit.FORM_ID
        WHERE foin.CLIENT_ID = trtm.CLIENT_ID AND
              fiit.FORM_NUMBER = 607
              fiit.FORM_ITEM_NUMBER = 3779 AND
              length(fiit.ANSWER) >= 1 AND
              foin.STARTDATE BETWEEN trtm.STARTDATUM AND NVL(trtm.ENDDATE, DATE '1999-01-01')
       )

I also encourage you to use modern explicit join syntax and the date keyword for expressing date constants.

Optimize slow correlated query in Oracle SQL

Answers (2)

Related Questions