leas
leas

Reputation: 301

Calculation in nested plpgsql function slower than in query directly?

I have a table with some text and some numeric columns, e.g.:

dimension_1, dimension_2, counter_1, counter_2

and instead of performing the query

SELECT dimension_1, dimension_2, (counter_1, NULLIF(counter_2, 0)) as kpi 
from table order by kpi desc nulls last;

I want to create a function and do:

SELECT dimension_1, dimension_2, func(counter_1, counter_2) as kpi
from table order by kpi desc nulls last;

I used the following implementation in Postgres:

CREATE FUNCTION kpi_latency_ext_msec(val1 numeric, val2 numeric)     
RETURNS numeric AS $func$
BEGIN

RETURN ($1 / NULLIF($2, 0::numeric));             

END; $func$
LANGUAGE PLPGSQL SECURITY DEFINER IMMUTABLE; 

and get the desired result but with slower performance.

From EXPLAIN ANALYZE I get:

1st query (with func):

Sort  (cost=800.85..806.75 rows=2358 width=26) (actual  time=5.534..5.710 rows=2358 loops=1)
Sort Key: (kpi_latency_ext_msec(external_tcp_handshake_latency_sum, external_tcp_handshake_latency_samples))
Sort Method: quicksort  Memory: 281kB
 ->  Seq Scan on counters_by_cgi_rat  (cost=0.00..668.76 rows=2358 width=26) (actual time=0.142..4.233 rows=2358 loops=1)
Filter: (("timestamp" >= '2018-05-10 00:00:00'::timestamp without time zone) AND ("timestamp" < '2018-05-13 00:00:00'::timestamp without time zone) AND (granularity = '1 day'::interval))
Planning time: 0.221 ms
Execution time: 5.881 ms

2nd query (no func):

Sort  (cost=223.14..229.04 rows=2358 width=26) (actual time=1.933..2.114 rows=2358 loops=1)

Sort Key: ((external_tcp_handshake_latency_sum / NULLIF(external_tcp_handshake_latency_samples, 0::numeric)))
Sort Method: quicksort  Memory: 281kB
->  Seq Scan on counters_by_cgi_rat  (cost=0.00..91.06 rows=2358 width=26) (actual time=0.010..1.190 rows=2358 loops=1)
Filter: (("timestamp" >= '2018-05-10 00:00:00'::timestamp without time zone) AND ("timestamp" < '2018-05-13 00:00:00'::timestamp without time zone) AND (granularity = '1 day'::interval))
Planning time: 0.139 ms
Execution time: 2.279 ms

Performing the queries without ORDER BY:

No func:

Seq Scan on table (cost=0.00..91.06 rows=2358 width=26) (actual time=0.016..1.223 rows=2358 loops=1)

With func:

Seq Scan on table (cost=0.00..668.76 rows=2358 width=26) (actual time=0.123..3.518 rows=2358 loops=1)

RESULTS for function no SECURITY DEFINER

Seq Scan on counters_by_cgi_rat  (cost=0.00..668.76 rows=2358 width=26) 
                                  (actual time=0.035..3.718 rows=2358 loops=1)
Filter: (("timestamp" >= '2018-05-10 00:00:00'::timestamp without time zone) 
        AND ("timestamp" < '2018-05-13 00:00:00'::timestamp without time zone) 
        AND (granularity = '1 day'::interval))
  Planning time: 0.086 ms
  Execution time: 3.923 ms

RESULTS for plain query

Seq Scan on counters_by_cgi_rat  (cost=0.00..91.06 rows=2358 width=26)    
                                 (actual time=0.017..1.175 rows=2358 loops=1)  
Filter: (("timestamp" >= '2018-05-10 00:00:00'::timestamp without time zone) 
AND ("timestamp" < '2018-05-13 00:00:00'::timestamp without time zone)     
AND (granularity = '1 day'::interval))
 Planning time: 0.105 ms
 Execution time: 1.356 ms

RESULTS with language sql

 Seq Scan on counters_by_cgi_rat  (cost=0.00..91.06 rows=2358 width=26)          
                                  (actual time=0.011..1.123 rows=2358 loops=1)
 Filter: (("timestamp" >= '2018-05-10 00:00:00'::timestamp without time zone) 
         AND ("timestamp" < '2018-05-13 00:00:00'::timestamp without time zone) 
         AND (granularity = '1 day'::interval))
 Planning time: 0.180 ms
 Execution time: 1.294 ms

FAST ENOUGH with language sql

For sure it is faster than using language plpgsql but slightly slower than the original query (after repeated runs)

========= UPDATE =========

CREATE FUNCTION kpi_latency_ext_msec(val1 numeric, val2 numeric)
RETURNS numeric LANGUAGE sql STABLE AS
'SELECT $1 / NULLIF($2, 0)';

Best results obtained with the above function (even faster than the plain query)

Upvotes: 3

Views: 947

Answers (1)

Erwin Brandstetter
Erwin Brandstetter

Reputation: 656666

The poison dart is SECURITY DEFINER. Functions declared SECURITY DEFINER cannot be inlined - and enforce a context switch if I am not mistaken. That can make them considerably more expensive. There is really no need for SECURITY DEFINER in the example. You do not need different privileges for the simple calculation. (Maybe your actual use case is different.)

And there is no need for PL/pgSQL either. Only SQL functions can be inlined - if some additional preconditions are met.

Since all used functions are IMMUTABLE, you should declare the function IMMUTABLE. (Default function volatility is VOLATILE.) You already updated the question accordingly. That allows expression indexes and can help prevent repeated evaluation in some situations. But it never helps with function inlining. Au contraire: it imposes more preconditions (which are met in this case). Quoting the Postgres Wiki on function inlining (last update 2016 at the time of writing):

if the function is declared IMMUTABLE, then the expression must not invoke any non-immutable function or operator

Quoting Tom Lane on pgsql-performance:

The basic point here is that a function marked volatile can be expanded to its contained functions even if they're immutable; but the other way around represents a potential semantic change, so the planner won't do it.

Solution

Try without SECURITY DEFINER:

CREATE FUNCTION kpi_latency_ext_msec(val1 numeric, val2 numeric)     
  RETURNS numeric AS
$func$
BEGIN
   RETURN $1 / NULLIF($2, numeric '0');
END
$func$  LANGUAGE plpgsql IMMUTABLE; 

Should be much faster already.

Or radically simplify to an SQL function:

CREATE FUNCTION f_div0_sql_nullif(val1 numeric, val2 numeric)     
  RETURNS numeric LANGUAGE sql IMMUTABLE AS
$$SELECT $1 / NULLIF($2, numeric '0')$$;

Faster, yet?

Related:

Benchmark

I used IF and CASE expressions at first, but after a_horse_with_no_name's comment I ran extensive tests showing NULLIF to be slightly faster. So I simplified to the original NULLIF variant accordingly.

The major points are still no SECURITY DEFINER, SQL and IMMUTABLE.

db<>fiddle here - pg 10
db<>fiddle here - pg 9.4

Upvotes: 3

Related Questions