user2830358
user2830358

Reputation: 11

Slow handling of 2-dimensional arrays in plpgsql

I am programming a stored procedure in PostgreSQL. The algorithm should handle a 2 dimensional array of double precision numbers.

As far as I have investigated array operations in Postgres are generic and quite heavy. The simple example I'm trying to prove has an excessive computational cost.

Example:

CREATE OR REPLACE FUNCTION fill_2d_array( rows integer, cols integer) 
  RETURNS integer AS
$BODY$

DECLARE

img double precision[][];

i integer; j integer;
cont integer;

BEGIN

img  := ARRAY( SELECT 0 FROM generate_series(1, filas * columnas) ) ; 
cont:= 0;
For i IN 1..rows LOOP
    For j IN 1..cols LOOP
        img[i * cols + j] := (i * cols + j)::double precision;
        cont := cont + 1;
    END LOOP;
END LOOP;

return cont;
END;
$BODY$
  LANGUAGE plpgsql;

Can someone help me find an alternative path or an improvement to handle two-dimensional arrays?

Upvotes: 0

Views: 2067

Answers (2)

Erwin Brandstetter
Erwin Brandstetter

Reputation: 656321

Procedural function

Basic problems

Declaring the dimensions of an array variable, like float8[][] for a 2-dimensional array, only serves documentation. Consider details in this related answer:

You confused 1-dimenstional and 2-dimensional arrays. While declaring a 2-dimenstional array (to no effect), you only make it out to be a 1-dimensional array.

To initialize an array, use array_fill():

img := array_fill(0, ARRAY[rows,cols])

This example produces a 2-dimensional array - as opposed to your faulty statement, producing a 1-dimensional array:

img  := ARRAY( SELECT 0 FROM generate_series(1, rows* cols) );  -- wrong!

The displayed array subscripts img[i * cols + j] hardly make sense. The maximum would be twice of what you initialized, resulting in "out-of-bound" errors. I suppose you mean img[i][j].

Working version

Everything put together it could work like this:

CREATE OR REPLACE FUNCTION f_array_fill(rows integer, cols integer, OUT img float8[][]) 
  LANGUAGE plpgsql AS
$func$
DECLARE
   i int;
   j int;
BEGIN
   img := array_fill(0, ARRAY[rows,cols]);

   FOR i IN 1 .. rows LOOP
       FOR j IN 1 .. cols LOOP
           img[i][j] := (i * cols + j)::float8;
       END LOOP;
   END LOOP;
END
$func$;

Call:

SELECT f_array_fill(2,3);

Result:

{{4,5,6},{7,8,9}}

To make the function useful, return the produced array. Using an OUT parameter for that.

Superior set-based version

Looping and individual assignments are comparatively slow in PL/pgSQL. See:

Array handling performs particularly poorly as explained by @Craig in this related answer:

A set-based approach used to scale better with old versions. But the difference is mostly gone with modern Postgres.

Aggregate function for multi-dimensional arrays

In Postgres 9.4 or older, we needed a custom aggregate function. array_agg() only produced 1-dimensional arrays. Since Postgres 9.5 we can just use built-in functions. See.

SQL function doing the the same as the above:

CREATE OR REPLACE FUNCTION f_array_fill_sql(_rows integer, _cols integer)
  RETURNS float8[][]  -- 2nd dim only for documentation
  LANGUAGE sql AS
$func$
SELECT array_agg(arr1) AS arr2
FROM  (
   SELECT array_agg((i * $2 + j)::float8) AS arr1
   FROM        generate_series(1, _rows) i
   CROSS  JOIN generate_series(1, _cols) j
   GROUP  BY i
   ORDER  BY i
   ) sub
$func$;

Call:

SELECT f_array_fill_sql(2,3);

Result:

{{4,5,6},{7,8,9}}

Upvotes: 2

jian
jian

Reputation: 4824

db fiddle link
In postgresql 14, array_cat changed. see: postgresql14 changes also https://www.postgresql.org/message-id/CAKFQuwbKPYOuaexih1fMDPKa7kTjLSVjNdWCRHn7iJ2%2BoukpHg%40mail.gmail.com

Then from the test doc (around line 732). Then in Postgresql 14, we need to recreate aggregate again.

create or replace aggregate array_agg_mult(anycompatiblearray) (
sfunc = array_cat,
stype = anycompatiblearray,
initcond = '{}'
);


create or replace function
    f_array_fill_sql(_rows integer,_cols integer)
returns float8[][] AS
$func$
    select array_agg_mult(array[arr1]) as arr2
    from (
        select array_agg((i * $2 + j )::float8) arr1
            from generate_series(1,$1) i
        cross join generate_series(1,$2) j
        group by i order by i
        ) sub
$func$ language sql;

Upvotes: 0

Related Questions