Reputation: 273
SELECT *
FROM vehicles t1
WHERE (SELECT COUNT(*) FROM vehicles t2
WHERE t1.pump_number = t2.pump_number
AND t1.updated_at < t2.updated_at
) < 4
AND t1.updated_at >= ?
And I supply '1970-01-01 00:00:00.000000'
for the parameter ?
.
I have around 10k records in the vehicles
table and no index is added. Above query takes around 10-20 seconds in execution.
How I can optimize it to decrease execution time?
Upvotes: 0
Views: 111
Reputation: 656391
To get an equivalent query, use the window function rank()
, not here:row_number()
SELECT *
FROM (
SELECT *
, rank() OVER (PARTITION BY pump_number ORDER BY updated_at DESC) AS rnk
FROM vehicles t1
) sub
WHERE rnk < 4
AND updated_at >= '1970-01-01 0:0';
And it has to be ORDER BY updated_at DESC
, to exclude rows that have more than three older peers for the same pump_number
. In other words:
"Get the three oldest rows per pump_number
- or more if there are ties on updated_at
".
Indexes are not going to help while you read most or all of the table anyway.
If (pump_number, updated_at)
is unique or / and there are relatively few distinct values for pump_number
in vehicles
, you can probably optimize further. There is not enough information in your question.
Upvotes: 0
Reputation: 1269563
This is your query:
SELECT *
FROM vehicles t1
WHERE (SELECT Count(*)
FROM vehicles t2
WHERE t1.pump_number = t2.pump_number AND
t1.updated_at < t2.updated_at
) < 4 AND
t1.updated_at >= ? " , "1970-01-01 00:00:00.000000")]
I would start by writing this using window functions:
select v.*
from (select v.*, row_number() over (partition by pump_number order by updated_at) as seqnum
from vehicles v
) v
where v.seqnum < 4 and t1.updated_at >= ?;
For this query, I would suggest indexes on vehicles(pump_number, updted_at)
and vehicles(updated_at)
.
Upvotes: 0
Reputation: 173
Postgres provide nice admin tool which has option EXPLAIN to see query execution plan . It will give great insights . here is the link for pgadmin in detail http://www.pgadmin.org/docs/1.4/query.html
Also use joins in your query instead of select that will increase your query performance
Upvotes: 1
Reputation: 3196
Try this( the columns in select and group by statement need to be replaced by your own):
SELECT
t1.id,
t1.updated_at,
t1.other_columns
FROM vehicles t1
INNER JOIN vehicles t2
ON t1.pump_number = t2.pump_number
AND t1.updated_at < t2.updated_at
WHERE t1.updated_at >= '1970-01-01 00:00:00.000000'
GROUP BY
t1.id,
t1.updated_at,
t1.other_columns
having count(*)< 4
After this change, you could try to add a index on column pump_number to see if it helps.
Upvotes: 0