Karlo Petravić
Karlo Petravić

Reputation: 83

PostgreSQL query performance and possible optimisation

I managed to write my query to get me correct data but to me it looks pretty bad since i had to use query inside query 3 times and even doe query performance is fine now around 700msec i am afraid it will slow down in future when there will be more data to process. Any info on how bad is this and how can i optimize it would be appreciated.

Edit:

I forgot to mention that tables s3 and s14 have multiple rows with same parcelno and i always need newest row from both tables(which is determined by sdate and stime). If newest row from s14 is newer than newest from s3, or newest row from s3 column emadr2 has same value as column parcelshop_id from table d, data is not shown. Keep in mind that these tables where not created by me and i am only reading data from them.

SELECT 
    q1.ddepot, 
    q1.parcelno, 
    q1.sdate, 
    q1.stime, 
    q1.dpostal, 
    q1.service, 
    q1.lorry,
    q1.zc5x3,
    q1.parcelshop_id,
    q1.country,
    q1.dname1
FROM(
    SELECT DISTINCT ON (q.parcelno) q.* FROM(
        SELECT 
            d.ddepot, 
            d.parcelno, 
            s3.sdate, 
            s3.stime, 
            d.dpostal, 
            d.service, 
            s3.lorry,
            s3.zc5x3,
            d.parcelshop_id,
            s3.country,
            d.dname1,
            s3.emadr1,
            s3.emadr2
        FROM dispatcher.detour_avis d
        LEFT JOIN scans.scandata03 s3 ON d.parcelno = s3.parcelno
        LEFT JOIN scans.scandata14 s14 ON d.parcelno = s14.parcelno 
        WHERE   
            d.ddate > (NOW() - interval '5 day') 
            AND d.parcelshop_id IS NOT NULL 
            AND s3.parcelno IS NOT NULL 
            AND (s14.parcelno IS NULL OR (s14.sdate + s14.stime)::timestamp without time zone < (s3.sdate + s3.stime)::timestamp without time zone)
        ORDER BY s3.sdate, s3.stime DESC
    )q 
    ORDER BY q.parcelno
) q1
WHERE q1.parcelshop_id != q1.emadr2

explain (analyze, verbose):

Subquery Scan on q1  (cost=68552.93..68554.90 rows=84 width=68) (actual time=701.318..701.324 rows=4 loops=1)
  Output: q1.ddepot, q1.parcelno, q1.sdate, q1.stime, q1.dpostal, q1.service, q1.lorry, q1.zc5x3, q1.parcelshop_id, q1.country, q1.dname1
  Filter: ((q1.parcelshop_id)::text <> (q1.emadr2)::text)
  Rows Removed by Filter: 2
  ->  Unique  (cost=68552.93..68553.85 rows=84 width=87) (actual time=701.310..701.314 rows=6 loops=1)
        Output: d.ddepot, d.parcelno, s3.sdate, s3.stime, d.dpostal, d.service, s3.lorry, s3.zc5x3, d.parcelshop_id, s3.country, d.dname1, s3.emadr1, s3.emadr2
        ->  Sort  (cost=68552.93..68553.39 rows=184 width=87) (actual time=701.309..701.311 rows=15 loops=1)
              Output: d.ddepot, d.parcelno, s3.sdate, s3.stime, d.dpostal, d.service, s3.lorry, s3.zc5x3, d.parcelshop_id, s3.country, d.dname1, s3.emadr1, s3.emadr2
              Sort Key: d.parcelno
              Sort Method: quicksort  Memory: 27kB
              ->  Sort  (cost=68543.71..68544.17 rows=184 width=87) (actual time=701.269..701.269 rows=15 loops=1)
                    Output: d.ddepot, d.parcelno, s3.sdate, s3.stime, d.dpostal, d.service, s3.lorry, s3.zc5x3, d.parcelshop_id, s3.country, d.dname1, s3.emadr1, s3.emadr2
                    Sort Key: s3.sdate, s3.stime
                    Sort Method: quicksort  Memory: 27kB
                    ->  Nested Loop  (cost=0.00..68536.79 rows=184 width=87) (actual time=689.775..701.238 rows=15 loops=1)
                          Output: d.ddepot, d.parcelno, s3.sdate, s3.stime, d.dpostal, d.service, s3.lorry, s3.zc5x3, d.parcelshop_id, s3.country, d.dname1, s3.emadr1, s3.emadr2
                          Join Filter: ((s14.parcelno IS NULL) OR ((s14.sdate + s14.stime) < (s3.sdate + s3.stime)))
                          Rows Removed by Join Filter: 16
                          ->  Nested Loop Left Join  (cost=0.00..57423.07 rows=455 width=74) (actual time=689.615..700.578 rows=14 loops=1)
                                Output: d.ddepot, d.parcelno, d.dpostal, d.service, d.parcelshop_id, d.dname1, s14.parcelno, s14.sdate, s14.stime
                                ->  Seq Scan on dispatcher.detour_avis d  (cost=0.00..49247.17 rows=455 width=47) (actual time=689.535..700.162 rows=11 loops=1)
                                      Output: d.id, d.parcelno, d.service, d.detour_type, d.ddepot, d.dname1, d.dname2, d.dstreet, d.dhouseno, d.dcountryn, d.dstate, d.dpostal, d.dcity, d.dphone, d.odepot, d.oname1, d.oname2, d.ostreet, d.ohouseno, d.ocoun (...)
                                      Filter: ((d.parcelshop_id IS NOT NULL) AND (d.ddate > (now() - '5 days'::interval)))
                                      Rows Removed by Filter: 985930
                                ->  Append  (cost=0.00..17.92 rows=5 width=33) (actual time=0.036..0.036 rows=1 loops=11)
                                      ->  Seq Scan on scans.scandata14 s14  (cost=0.00..0.00 rows=1 width=58) (actual time=0.000..0.000 rows=0 loops=11)
                                            Output: s14.parcelno, s14.sdate, s14.stime
                                            Filter: ((d.parcelno)::text = (s14.parcelno)::text)
                                      ->  Index Scan using scandata14_2013_pl_indx on scans.scandata14_2013 s14_1  (cost=0.14..0.25 rows=1 width=27) (actual time=0.001..0.001 rows=0 loops=11)
                                            Output: s14_1.parcelno, s14_1.sdate, s14_1.stime
                                            Index Cond: ((d.parcelno)::text = (s14_1.parcelno)::text)
                                      ->  Index Scan using scandata14_2014_pl_indx on scans.scandata14_2014 s14_2  (cost=0.29..4.29 rows=1 width=27) (actual time=0.007..0.007 rows=0 loops=11)
                                            Output: s14_2.parcelno, s14_2.sdate, s14_2.stime
                                            Index Cond: ((d.parcelno)::text = (s14_2.parcelno)::text)
                                      ->  Index Scan using scandata14_2015_pl_indx on scans.scandata14_2015 s14_3  (cost=0.42..6.47 rows=1 width=27) (actual time=0.010..0.010 rows=0 loops=11)
                                            Output: s14_3.parcelno, s14_3.sdate, s14_3.stime
                                            Index Cond: ((d.parcelno)::text = (s14_3.parcelno)::text)
                                      ->  Index Scan using scandata14_2016_pl_indx on scans.scandata14_2016 s14_4  (cost=0.42..6.91 rows=1 width=27) (actual time=0.014..0.015 rows=1 loops=11)
                                            Output: s14_4.parcelno, s14_4.sdate, s14_4.stime
                                            Index Cond: ((d.parcelno)::text = (s14_4.parcelno)::text)
                          ->  Append  (cost=0.00..24.34 rows=5 width=80) (actual time=0.044..0.045 rows=2 loops=14)
                                ->  Seq Scan on scans.scandata03 s3  (cost=0.00..0.00 rows=1 width=186) (actual time=0.000..0.000 rows=0 loops=14)
                                      Output: s3.sdate, s3.stime, s3.lorry, s3.zc5x3, s3.country, s3.emadr1, s3.emadr2, s3.parcelno
                                      Filter: ((s3.parcelno IS NOT NULL) AND ((d.parcelno)::text = (s3.parcelno)::text))
                                ->  Index Scan using scandata03_2013_pl_indx on scans.scandata03_2013 s3_1  (cost=0.14..0.26 rows=1 width=51) (actual time=0.001..0.001 rows=0 loops=14)
                                      Output: s3_1.sdate, s3_1.stime, s3_1.lorry, s3_1.zc5x3, s3_1.country, s3_1.emadr1, s3_1.emadr2, s3_1.parcelno
                                      Index Cond: (((s3_1.parcelno)::text = (d.parcelno)::text) AND (s3_1.parcelno IS NOT NULL))
                                ->  Index Scan using scandata03_2014_pl_indx on scans.scandata03_2014 s3_2  (cost=0.42..7.55 rows=1 width=53) (actual time=0.009..0.009 rows=0 loops=14)
                                      Output: s3_2.sdate, s3_2.stime, s3_2.lorry, s3_2.zc5x3, s3_2.country, s3_2.emadr1, s3_2.emadr2, s3_2.parcelno
                                      Index Cond: (((s3_2.parcelno)::text = (d.parcelno)::text) AND (s3_2.parcelno IS NOT NULL))
                                ->  Index Scan using scandata03_2015_pl_indx on scans.scandata03_2015 s3_3  (cost=0.42..8.21 rows=1 width=54) (actual time=0.013..0.013 rows=0 loops=14)
                                      Output: s3_3.sdate, s3_3.stime, s3_3.lorry, s3_3.zc5x3, s3_3.country, s3_3.emadr1, s3_3.emadr2, s3_3.parcelno
                                      Index Cond: (((s3_3.parcelno)::text = (d.parcelno)::text) AND (s3_3.parcelno IS NOT NULL))
                                ->  Index Scan using scandata03_2016_pl_indx on scans.scandata03_2016 s3_4  (cost=0.43..8.31 rows=1 width=55) (actual time=0.019..0.020 rows=2 loops=14)
                                      Output: s3_4.sdate, s3_4.stime, s3_4.lorry, s3_4.zc5x3, s3_4.country, s3_4.emadr1, s3_4.emadr2, s3_4.parcelno
                                      Index Cond: (((s3_4.parcelno)::text = (d.parcelno)::text) AND (s3_4.parcelno IS NOT NULL))
Planning time: 4.670 ms
Execution time: 701.550 ms

Upvotes: 2

Views: 84

Answers (1)

Clodoaldo Neto
Clodoaldo Neto

Reputation: 125224

It seems to me there are lots of unnecessary nesting. Check if this is functionally equivalent

select distinct on (d.parcelno) d.*
from
    dispatcher.detour_avis d
    inner join
    scans.scandata03 s3 on d.parcelno = s3.parcelno
    left join
    scans.scandata14 s14 on d.parcelno = s14.parcelno
where
    d.ddate > now() - interval '5 day'
    and d.parcelshop_id is not null and parcelshop_id != emadr2
    and (
        s14.parcelno is null or
        (s14.sdate + s14.stime)::timestamp < (s3.sdate + s3.stime)::timestamp
    )
order by d.parcelno

As you are doing a left join and putting in the where clause the s3.parcelno is not null condition which contains the right table column join condition you are really doing an inner join. So I just eliminated it from the where clause and turned the left into an inner join

Upvotes: 1

Related Questions