Reputation: 11
I am programming a stored procedure in PostgreSQL. The algorithm should handle a 2 dimensional array of double precision
numbers.
As far as I have investigated array operations in Postgres are generic and quite heavy. The simple example I'm trying to prove has an excessive computational cost.
Example:
CREATE OR REPLACE FUNCTION fill_2d_array( rows integer, cols integer)
RETURNS integer AS
$BODY$
DECLARE
img double precision[][];
i integer; j integer;
cont integer;
BEGIN
img := ARRAY( SELECT 0 FROM generate_series(1, filas * columnas) ) ;
cont:= 0;
For i IN 1..rows LOOP
For j IN 1..cols LOOP
img[i * cols + j] := (i * cols + j)::double precision;
cont := cont + 1;
END LOOP;
END LOOP;
return cont;
END;
$BODY$
LANGUAGE plpgsql;
Can someone help me find an alternative path or an improvement to handle two-dimensional arrays?
Upvotes: 0
Views: 2067
Reputation: 656321
Declaring the dimensions of an array variable, like float8[][]
for a 2-dimensional array, only serves documentation. Consider details in this related answer:
You confused 1-dimenstional and 2-dimensional arrays. While declaring a 2-dimenstional array (to no effect), you only make it out to be a 1-dimensional array.
To initialize an array, use array_fill()
:
img := array_fill(0, ARRAY[rows,cols])
This example produces a 2-dimensional array - as opposed to your faulty statement, producing a 1-dimensional array:
img := ARRAY( SELECT 0 FROM generate_series(1, rows* cols) ); -- wrong!
The displayed array subscripts img[i * cols + j]
hardly make sense. The maximum would be twice of what you initialized, resulting in "out-of-bound" errors. I suppose you mean img[i][j]
.
Everything put together it could work like this:
CREATE OR REPLACE FUNCTION f_array_fill(rows integer, cols integer, OUT img float8[][])
LANGUAGE plpgsql AS
$func$
DECLARE
i int;
j int;
BEGIN
img := array_fill(0, ARRAY[rows,cols]);
FOR i IN 1 .. rows LOOP
FOR j IN 1 .. cols LOOP
img[i][j] := (i * cols + j)::float8;
END LOOP;
END LOOP;
END
$func$;
Call:
SELECT f_array_fill(2,3);
Result:
{{4,5,6},{7,8,9}}
To make the function useful, return the produced array. Using an OUT
parameter for that.
Looping and individual assignments are comparatively slow in PL/pgSQL. See:
Array handling performs particularly poorly as explained by @Craig in this related answer:
A set-based approach used to scale better with old versions. But the difference is mostly gone with modern Postgres.
In Postgres 9.4 or older, we needed a custom aggregate function. array_agg()
only produced 1-dimensional arrays. Since Postgres 9.5 we can just use built-in functions. See.
SQL function doing the the same as the above:
CREATE OR REPLACE FUNCTION f_array_fill_sql(_rows integer, _cols integer)
RETURNS float8[][] -- 2nd dim only for documentation
LANGUAGE sql AS
$func$
SELECT array_agg(arr1) AS arr2
FROM (
SELECT array_agg((i * $2 + j)::float8) AS arr1
FROM generate_series(1, _rows) i
CROSS JOIN generate_series(1, _cols) j
GROUP BY i
ORDER BY i
) sub
$func$;
Call:
SELECT f_array_fill_sql(2,3);
Result:
{{4,5,6},{7,8,9}}
Upvotes: 2
Reputation: 4824
db fiddle link
In postgresql 14, array_cat
changed. see: postgresql14 changes
also https://www.postgresql.org/message-id/CAKFQuwbKPYOuaexih1fMDPKa7kTjLSVjNdWCRHn7iJ2%2BoukpHg%40mail.gmail.com
Then from the test doc (around line 732). Then in Postgresql 14, we need to recreate aggregate again.
create or replace aggregate array_agg_mult(anycompatiblearray) (
sfunc = array_cat,
stype = anycompatiblearray,
initcond = '{}'
);
create or replace function
f_array_fill_sql(_rows integer,_cols integer)
returns float8[][] AS
$func$
select array_agg_mult(array[arr1]) as arr2
from (
select array_agg((i * $2 + j )::float8) arr1
from generate_series(1,$1) i
cross join generate_series(1,$2) j
group by i order by i
) sub
$func$ language sql;
Upvotes: 0