Reputation: 327
I'm not as clued up on shortcuts in SQL so I was hoping to utilize the brainpower on here to help speed up a query I'm using. I'm currently using Oracle 8i.
I have a query:
SELECT
NAME_CODE, ACTIVITY_CODE, GPS_CODE
FROM
(SELECT
a.NAME_CODE, b.ACTIVITY_CODE, a.GPS_CODE,
ROW_NUMBER() OVER (PARTITION BY a.GPS_DATE ORDER BY b.ACTIVITY_DATE DESC) AS RN
FROM GPS_TABLE a, ACTIVITY_TABLE b
WHERE a.NAME_CODE = b.NAME_CODE
AND a.GPS_DATE >= b.ACTIVITY_DATE
AND TRUNC(a.GPS_DATE) > TRUNC(SYSDATE) - 2)
WHERE
RN = 1
and this takes about 7 minutes give or take 10 seconds to run.
Now the GPS_TABLE
is currently 6.586.429 rows and continues to grow as new GPS coordinates are put into the system, each day it grows by about 8.000 rows in 6 columns.
The ACTIVITY_TABLE
is currently 1.989.093 rows and continues to grow as new activities are put into the system, each day it grows by about 2.000 rows in 31 columns.
So all in all these are not small tables and I understand that there will always be a time hit running this or similar queries. As you can see I'm already limiting it to only the last 2 days worth of data, but anything to speed it up would be appreciated.
Upvotes: 0
Views: 96
Reputation: 67772
Your strongest filter seems to be the filter on the last 2 days of GPS_TABLE
. It should filter the GPS_TABLE
to about 15k rows. Therefore one of the best candidate for improvement is an index on the column GPS_DATE
.
You will find that your filter TRUNC(a.GPS_DATE) > TRUNC(SYSDATE) - 2
is equivalent to a.GPS_DATE > TRUNC(SYSDATE) - 2
, therefore a simple index on your column will work if you change the query. If you can't change it, you could add a function-based index on TRUNC(GPS_DATE)
.
Once you have this index in place, we need to access the rows in ACTIVITY_TABLE
. The problem with your join is that we will get all the old activities and therefore a good portion of the table. This means that the join as it is will not be efficient with index scans.
I suggest you define an index on ACTIVITY_TABLE(name_code, activity_date DESC)
and a PL/SQL function that will retrieve the last activity in the least amount of work using this index specifically:
CREATE OR REPLACE FUNCTION get_last_activity (p_name_code VARCHAR2,
p_gps_date DATE)
RETURN ACTIVITY_TABLE.activity_code%type IS
l_result ACTIVITY_TABLE.activity_code%type;
BEGIN
SELECT activity_code
INTO l_result
FROM (SELECT activity_code
FROM activity_table
WHERE name_code = p_name_code
AND activity_date <= p_gps_date
ORDER BY activity_date DESC)
WHERE ROWNUM = 1;
RETURN l_result;
END;
Modify your query to use this function:
SELECT a.NAME_CODE,
a.GPS_CODE,
get_last_activity(a.name_code, a.gps_date)
FROM GPS_TABLE a
WHERE trunc(a.GPS_DATE) > trunc(sysdate) - 2
Upvotes: 3
Reputation: 700720
Optimising an SQL query is generally done by:
So, start by adding an index for ACTIVITY_DATE
, and perhaps some other fields that are used in the conditions.
Upvotes: 1