Karl Bartel
Karl Bartel

Reputation: 3434

Avoid multiple calls on same function when expanding composite result

I have an SQL function retuning a composite result.

CREATE TYPE result_t AS (a int, b int, c int, d numeric);

CREATE OR REPLACE FUNCTION slow_function(int)
RETURNS result_t
AS $$
    -- just some placeholder code to make it slow
    SELECT 0, 0, 0, (
        SELECT sum(ln(i::numeric))
        FROM generate_series(1, $1) i
    )
$$ LANGUAGE sql IMMUTABLE;

When calling the function, I would like to have the parts of the composite type expanded into several columns. That works fine when I call:

SELECT (slow_function(i)).*
FROM generate_series(0, 200) i

a     b     c     d                    
----  ----  ----  -------------------- 
0     0     0     (null)               
0     0     0     0                    
0     0     0     0.6931471805599453   
0     0     0     1.791759469228055
...
Total runtime: 6196.754 ms

Unfortunately, this causes the function to be called once per result column, which is unnecessarily slow. This can be tested by comparing the run time with a query, which directly returns the composite result and runs four times as fast:

SELECT slow_function(i)
FROM generate_series(0, 200) i
...
Total runtime: 1561.476 ms

The example code is also at http://sqlfiddle.com/#!15/703ba/7

How can I get the result with multiple columns without wasting CPU power?

Upvotes: 2

Views: 263

Answers (3)

Erwin Brandstetter
Erwin Brandstetter

Reputation: 656401

A CTE is not even necessary. A plain subquery does the job.

SELECT i, (f).*                     -- decompose here
FROM  (
   SELECT i, (slow_func(i)) AS f    -- do not decompose here
   FROM   generate_series(1, 3) i
   ) sub;

Be sure not to decompose the record (composite result) of the function in the subquery. Do that in the outer query.
Requires a registered row type. Not possible with anonymous records.

Or, what @Richard wrote, a LATERAL JOIN works, too. The syntax can be simpler:

SELECT * FROM generate_series(1, 3) i, slow_func(i) f

LATERAL is applied implicitly in Postgres 9.3 or later.

A function can stand on its own in the FROM clause, doesn't have to be wrapped in an additional sub-select. Just like a table in its place.

fiddle with EXPLAIN VERBOSE output for all variants. You can see multiple evaluation of the function if it happens.
Old sqlfiddle

COST setting

Generally (should not matter for this particular query), make sure to apply a high cost setting to your function, so the planner knows to avoid evaluating more often then necessary. Like:

CREATE OR REPLACE FUNCTION slow_function(int)
  RETURNS result_t
  LANGUAGE sql IMMUTABLE COST 100000 AS
$func$
    -- expensive body
$func$;

The manual:

Larger values cause the planner to try to avoid evaluating the function more often than necessary.

Upvotes: 1

Karl Bartel
Karl Bartel

Reputation: 3434

One way to work around this problem is by using a WITH clause.

WITH t AS (
    SELECT slow_function(i) AS s
    FROM generate_series(0, 200) i
)
SELECT (s).*
FROM t

This works because the result of the WITH clause gets materialized before the rest of the query is executed. But this solution is often bad for more complex cases, because it greatly reduces the query optimizer's options for improving the query execution in other ways. So I'm still looking for a better way to solve this problem.

Upvotes: 1

Richard Huxton
Richard Huxton

Reputation: 22893

Perhaps a LATERAL subquery is what you want.

SELECT t.id, f.* FROM some_table t, LATERAL (SELECT slow_func(t.id)) f

That will call the function once for each row, and then "unwrap" the result into columns in the output. Any subquery will do for the "unwrapping" but LATERAL is what lets you reference columns from other clauses.

I believe LATERAL was introduced in PostgreSQL 9.3

Upvotes: 2

Related Questions