user1465266
user1465266

Reputation: 681

How to merge all integer arrays from all records into single array in postgres

I have a column which is of type integer array. How can I merge all of them into a single integer array?

For example: If I execute query:

select column_name from table_name

I get result set as:

-[RECORD 1]----------
column_name | {1,2,3}
-[RECORD 2]----------
column_name | {4,5}

How can I get {1,2,3,4,5} as final result?

Upvotes: 55

Views: 43098

Answers (7)

jdmichal
jdmichal

Reputation: 11152

To enrich Craig Ringer's answer, his CREATE AGGREGATE solution is good but reportedly slow. That's because it only has an SFUNC defined, which means it can only aggregate a single item into the array, one at a time.

This can be improved by defining the aggregate as a partial aggregate. In this mode, it can collect sub-arrays -- possibly in parallel -- then merge those arrays into larger arrays. This is done by adding a COMBINEFUNC definition, which in this case is also just array_cat. I also mark the function as parallel safe.

CREATE AGGREGATE public.array_agg_flat(anycompatiblearray) (
    SFUNC = array_cat,
    COMBINEFUNC = array_cat,
    STYPE = anycompatiblearray,
    PARALLEL = SAFE
);

This should drastically reduce the number of array (re)allocations occurring behind the scenes.

Upvotes: 0

Maxim
Maxim

Reputation: 2391

to merge arrays you can use || operator more

to put result in flat list use uunest func

example:

select unnest(ARRAY[1,2] || ARRAY[3,2] || ARRAY[4,5]) as number;

result

Upvotes: 4

mouse500
mouse500

Reputation: 51

string_to_array(string_agg(array_to_string(column_name ,','),','),',')

This can be some clue for your situation. I've done like this.

Upvotes: 3

Craig Ringer
Craig Ringer

Reputation: 324455

Define a trivial custom aggregate:

CREATE AGGREGATE array_cat_agg(anyarray) (
  SFUNC=array_cat,
  STYPE=anyarray
);

and use it:

WITH v(a) AS ( VALUES (ARRAY[1,2,3]), (ARRAY[4,5,6,7]))
SELECT array_cat_agg(a) FROM v;

If you want a particular order, put it within the aggregate call, i.e. array_cat_agg(a ORDER BY ...)

This is roughly O(n log n) for n rows (I think) O(n²) so it is unsuitable for long sets of rows. For better performance you'd need to write it in C, where you can use the more efficient (but horrible to use) C API for PostgreSQL arrays to avoid re-copying the array each iteration.

Upvotes: 25

Vladimir Aleshin
Vladimir Aleshin

Reputation: 181

You could use lateral subquery for that:

select array_agg(u.a)
from (values (array[1, 2, 3]), (array[4, 5])) t (a)
    join lateral unnest(t.a) u (a) on true;

Upvotes: 1

mu is too short
mu is too short

Reputation: 434665

You could use unnest to open up the arrays and then array_agg to put them back together:

select array_agg(c)
from (
  select unnest(column_name)
  from table_name
) as dt(c);

Upvotes: 74

Patrick
Patrick

Reputation: 32199

The only way you can do this is inside a function:

CREATE FUNCTION merge_arrays() RETURNS int[] AS $$
DECLARE
  this record;
  res  int[];
BEGIN
  FOR this IN
    SELECT column_name FROM table_name
  LOOP
    array_cat(res, this.column_name);
  END LOOP;
  RETURN res;
END; $$ LANGUAGE plpgsql;

Then you can

SELECT merge_arrays();

to get the result you are looking for.

This of course hard-codes your table definition into the function, which may (or may not) be an issue. In addition, you may want to put a WHERE clause in the loop query to restrict the records whose arrays you want to append; you might use an additional function parameter to do this.

Keep in mind that you might get a really large array as your table increases in size and that may affect performance. Do you really need all sub-arrays from all records in one large array? Have a look at your application and see if you can do the merge at that level, rather than in a single query.

Upvotes: -2

Related Questions