Guan Hui
Guan Hui

Reputation: 1

Optimizing query for huge database

I am using a SQL query for monthly extraction of data from a huge PostgreSQL replica database which stores location data. Currently I have split it into 3 parts (10 days each) and each part is taking roughly 21 hours to complete. Was wondering if there is any way to optimize the query and process the data more quickly.

select
  asset_dcs.registration_number,
  date_trunc('day', transmitter_received_dttm + '08:00:00' + '-04:00:00') AS bussines_date,
  min(seq_num) as min_seq_num,
  max(seq_num) as max_seq_num,
  count (*) row_count
from dcs_posn
LEFT OUTER JOIN asset_dcs on (asset_id = asset_dcs.id)
where 1=1 
and date_trunc('day', transmitter_received_dttm + '08:00:00' + '-04:00:00') > '2015-12-31'
and date_trunc('day', transmitter_received_dttm + '08:00:00' + '-04:00:00') <= '2016-01-10'
group by asset_id, bussines_date, asset_dcs.registration_number;

Upvotes: 0

Views: 62

Answers (1)

Patrick
Patrick

Reputation: 32199

The most obvious improvement is in your filter:

where 1=1 
and date_trunc('day', transmitter_received_dttm + '08:00:00' + '-04:00:00') > '2015-12-31'
and date_trunc('day', transmitter_received_dttm + '08:00:00' + '-04:00:00') <= '2016-01-10'

should be rewritten as:

WHERE transmitter_received_dttm > '2015-12-31 20:00:00'::timestamp
  AND transmitter_received_dttm <= '2016-01-10 20:00:00'::timestamp

The date_trunc() function is very wasteful the way that you use it.

Otherwise you should add an EXPLAIN ... to your question so that we can see the query plan, as well as other performance-related information such as any indexes.

Upvotes: 1

Related Questions