kayman

Reputation: 23

PostgreSQL function runs much slower than direct query

My problem started when I tried to wrap a simple SQL statement in a function. The query, which used to take a few milliseconds, began to take seconds. I read everything I could find on Google, but it was all about function parameters causing bad plans. So I got rid of the parameters, but that didn't solve the problem.

So now I have the following SQL statement:

select fn_tz(idata.system_timestamp), gp.zname, fc.*
from jv idata   
join gp on st_distance(waypoint,geom)=0
join fc ON 
    idata.fare_code = fc.fare_code  AND 
    fc.validity @> fn_tz(idata.system_timestamp)::date  and 
    fc.polygon_name = gp.name
where idata.item_id='21159704983720122917';

When I run this query directly, it takes about 80 ms to execute. But if I put the same query, without any modification, into a function body, it takes about 10 seconds!

Anyway, I've tried everything that came to mind:

  1. A regular PL/pgSQL function (return query ...)
  2. A regular SQL function (select ...)
  3. A PL/pgSQL function with a dynamic query (return query execute 'select ...')
  4. Finally, a PREPARE/EXECUTE statement

All of the above methods give the same result: 10 seconds.
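For reference, the PL/pgSQL variant (1) looked roughly like this; the function name and the return column types are illustrative, the body is the query verbatim:

```sql
-- Illustrative wrapper; return types are assumed, the body is unchanged
create or replace function get_fares()
returns table (local_ts timestamp, zname text, fare_code text)
language plpgsql
as $$
begin
    return query
    select fn_tz(idata.system_timestamp), gp.zname, fc.fare_code
    from jv idata
    join gp on st_distance(waypoint, geom) = 0
    join fc on idata.fare_code = fc.fare_code
           and fc.validity @> fn_tz(idata.system_timestamp)::date
           and fc.polygon_name = gp.name
    where idata.item_id = '21159704983720122917';
end;
$$;

select * from get_fares();
```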

Then I ran EXPLAIN ANALYZE EXECUTE ... on the prepared statement, but even from its output I can't understand why it takes 10 seconds:
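The prepared-statement variant (4) was along these lines (statement name is illustrative), and the plan below is from the EXPLAIN ANALYZE EXECUTE step:

```sql
-- Prepare the statement once, then analyze its execution
PREPARE fare_stmt AS
select fn_tz(idata.system_timestamp), gp.zname, fc.*
from jv idata
join gp on st_distance(waypoint, geom) = 0
join fc on idata.fare_code = fc.fare_code
       and fc.validity @> fn_tz(idata.system_timestamp)::date
       and fc.polygon_name = gp.name
where idata.item_id = '21159704983720122917';

EXPLAIN ANALYZE EXECUTE fare_stmt;
```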

Hash Join  (cost=75.05..962862.24 rows=110 width=8) (actual time=1.075..10290.527 rows=476091 loops=1)
  Hash Cond: ((idata.fare_code = fc.fare_code) AND (gp.name = (fc.polygon_name)::text))
  Join Filter: (fc.validity @> (fn_tz(idata.system_timestamp))::date)
  ->  Nested Loop  (cost=0.00..925098.69 rows=59399 width=54) (actual time=0.298..8300.070 rows=53922 loops=1)
        Join Filter: (st_distance(idata.waypoint, gp.geom) = '0'::double precision)
        Rows Removed by Join Filter: 2212398
        ->  Seq Scan on jv idata  (cost=0.00..4402.99 rows=53999 width=54) (actual time=0.039..33.038 rows=53960 loops=1)
              Filter: (item_id = '21159704983720122917'::text)
              Rows Removed by Filter: 3079
        ->  Materialize  (cost=0.00..13.30 rows=220 width=64) (actual time=0.000..0.003 rows=42 loops=53960)
              ->  Seq Scan on gp  (cost=0.00..12.20 rows=220 width=64) (actual time=0.006..0.025 rows=42 loops=1)
  ->  Hash  (cost=40.22..40.22 rows=2322 width=16) (actual time=0.717..0.717 rows=2268 loops=1)
        Buckets: 4096  Batches: 1  Memory Usage: 141kB
        ->  Seq Scan on fc  (cost=0.00..40.22 rows=2322 width=16) (actual time=0.008..0.332 rows=2322 loops=1)
Planning Time: 0.008 ms
Execution Time: 10324.558 ms

Surprisingly, if I run EXPLAIN ANALYZE on the original query, it also takes about 10 seconds and produces an almost identical execution plan.

My server is a managed Postgres 11.8 instance on Google Cloud Platform.

What else can I do or try?

Update: I should emphasize that I am not looking for a way to improve the query's performance. The direct query runs in 80 ms and I'm happy with that. I want to find the reason why the direct query runs 100 times(!) faster than the same query inside a function body.

Upvotes: 0

Views: 485

Answers (1)

Laurenz Albe

Reputation: 247980

fn_tz might be costing you a couple of seconds, but the main problem is that you are using st_distance rather than st_dwithin, which can be supported by an index:

select fn_tz(idata.system_timestamp), gp.zname, fc.*
from jv idata   
join gp on st_dwithin(waypoint, geom, 0)
join fc ON 
    idata.fare_code = fc.fare_code  AND 
    fc.validity @> fn_tz(idata.system_timestamp)::date  and 
    fc.polygon_name = gp.name
where idata.item_id='21159704983720122917';

The index would be:

CREATE INDEX ON jv USING gist (waypoint);
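To confirm the planner actually picks up the new GiST index (a sketch; table and column names as in the question, and the exact plan shape will depend on your data and statistics):

```sql
-- Refresh statistics so the planner sees the index and current row counts
ANALYZE jv;

-- Check the plan for the spatial join; look for an index scan or
-- bitmap index scan over the GiST index on jv.waypoint instead of
-- a sequential scan with an st_distance join filter
EXPLAIN
select gp.zname
from jv idata
join gp on st_dwithin(idata.waypoint, gp.geom, 0)
where idata.item_id = '21159704983720122917';
```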

Upvotes: 2

Related Questions