Baker
Baker

Reputation: 505

PostgreSQL: How to run query in parallel in function?

I have a simple query

select bod_vykonu_kod, count(1)
from cdc_s5_zdroj
group by 1
order by 1 desc;

Which runs in parallel like it should.
Explain analyze: https://explain.depesz.com/s/auVt

Then if I put the same query in function it does not run in parallel. I tried it as STABLE or VOLATILE and still no parallel. I also added PARALLEL SAFE but no difference.

CREATE OR REPLACE FUNCTION 
test_par () 
returns table (
t_column1 bigint,
t_column2 bigint
)

volatile
PARALLEL SAFE
AS $dbvis$

BEGIN

RETURN QUERY
select bod_vykonu_kod, count(1)
from cdc_s5_zdroj
group by 1
order by 1 desc;

END;
$dbvis$ LANGUAGE plpgsql

Explain analyze volatile: https://explain.depesz.com/s/glFO
Explain analyze stable: https://explain.depesz.com/s/vnXO
Explain analyze stable and parallel safe: https://explain.depesz.com/s/QlKM

PostgreSQL 11.5 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-36), 64-bit
max_parallel_workers = 8
max_parallel_workers_per_gather = 4

Am I doing something wrong or functions does not support parallel execution like this?

Upvotes: 3

Views: 6764

Answers (2)

Jeremy
Jeremy

Reputation: 6723

I had this same issue with a plpgsql function. RETURN QUERY never ran in parallel. The only solution I could find was to do something like this, because CREATE TABLE AS will use parallel processing:

CREATE OR REPLACE FUNCTION 
test_par () 
returns table (
t_column1 bigint,
t_column2 bigint
)

volatile
AS $dbvis$

BEGIN
CREATE TEMPORARY TABLE my_temp ON COMMIT DROP AS
select bod_vykonu_kod, count(1)
from cdc_s5_zdroj
group by 1
order by 1 desc;

RETURN QUERY SELECT * FROM my_temp;
DROP TABLE IF EXISTS my_temp;
END;
$dbvis$ LANGUAGE plpgsql

It's not ideal, but for my situation, it was still a lot faster than not using parallel processing.

Update: as of Postures 14, return query will now use parallel workers. We no longer need to use the create table workaround.

Upvotes: 5

Laurenz Albe
Laurenz Albe

Reputation: 246698

I dug into the code to see why RETURN QUERY does not support parallel execution.

The reason is that it uses a cursor to fetch query result in batches of 50, and queries executed using a cursor are not run in parallel (because execution could be suspended).

This is the relevant code in function exec_stmt_return_query from src/pl/plpgsql/src/pl_exec.c:

exec_stmt_return_query(PLpgSQL_execstate *estate,
                       PLpgSQL_stmt_return_query *stmt)
{
[...]
    if (stmt->query != NULL)
    {
        /* static query */
        exec_run_select(estate, stmt->query, 0, &portal);
    }
[...]
    while (true)
    {
        uint64      i;

        SPI_cursor_fetch(portal, true, 50);

[...]

Upvotes: 4

Related Questions