Reputation: 1
I am using a SQL query for a monthly extraction of data from a huge PostgreSQL replica database that stores location data. Currently I have split it into 3 parts (10 days each), and each part takes roughly 21 hours to complete. Is there any way to optimize the query so the data can be processed more quickly?
select
asset_dcs.registration_number,
date_trunc('day', transmitter_received_dttm + '08:00:00' + '-04:00:00') AS business_date,
min(seq_num) as min_seq_num,
max(seq_num) as max_seq_num,
count(*) AS row_count
from dcs_posn
LEFT OUTER JOIN asset_dcs on (asset_id = asset_dcs.id)
where 1=1
and date_trunc('day', transmitter_received_dttm + '08:00:00' + '-04:00:00') > '2015-12-31'
and date_trunc('day', transmitter_received_dttm + '08:00:00' + '-04:00:00') <= '2016-01-10'
group by asset_id, business_date, asset_dcs.registration_number;
Upvotes: 0
Views: 62
Reputation: 32199
The most obvious improvement is in your filter:
where 1=1
and date_trunc('day', transmitter_received_dttm + '08:00:00' + '-04:00:00') > '2015-12-31'
and date_trunc('day', transmitter_received_dttm + '08:00:00' + '-04:00:00') <= '2016-01-10'
should be rewritten as:
WHERE transmitter_received_dttm >= '2015-12-31 20:00:00'::timestamp
AND transmitter_received_dttm < '2016-01-10 20:00:00'::timestamp
The date_trunc() function is very wasteful the way that you use it: it has to be evaluated for every row in the table, and wrapping the column in an expression makes the predicate non-sargable, so PostgreSQL cannot use a plain index on transmitter_received_dttm to narrow the scan.
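Once the filter compares the raw column, a plain b-tree index can serve the range scan. A minimal sketch, assuming the table and column names from the question (the index name itself is illustrative):

```sql
-- Hypothetical index name; table and column are from the question.
-- With the sargable WHERE clause above, PostgreSQL can choose an
-- index (or bitmap) scan on this index instead of reading every row.
CREATE INDEX idx_dcs_posn_received_dttm
    ON dcs_posn (transmitter_received_dttm);
```

On a huge table, building the index will take time of its own; on a busy replica's primary, CREATE INDEX CONCURRENTLY avoids blocking writes while it builds.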
Otherwise you should add the output of EXPLAIN for the query to your question so that we can see the query plan, as well as other performance-related information such as any existing indexes.
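For reference, the plan can be captured like this (a sketch reusing the question's table and column names; note that ANALYZE actually executes the statement, so try it on a short date range first):

```sql
-- EXPLAIN (ANALYZE, BUFFERS) runs the statement and reports the real
-- plan, per-node row counts, timings, and buffer usage.
EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*)
FROM dcs_posn
WHERE transmitter_received_dttm > '2015-12-31 20:00:00'::timestamp
  AND transmitter_received_dttm <= '2016-01-10 20:00:00'::timestamp;
```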
Upvotes: 1