Reputation: 35
I have a table like the following:
| col_A | col_B |
|-------|-------|
| 1 | 1 |
| 1 | 2 |
| 1 | 3 |
| 2 | 1 |
| 2 | 2 |
| 2 | 3 |
| 3 | 1 |
| 3 | 2 |
I want to group and concatenate the results into an array like the following:
| col_A | col_B |
|-------|-------|
| 1,2 | 1,2,3 |
| 3 | 1,2 |
My attempt at writing a query:
SELECT col_A, array_agg(col_B ORDER BY col_B DESC) FROM table GROUP BY col_A;
However, this outputs:
| col_A | col_B |
|-------|---------|
| 1 | {1,2,3} |
| 2 | {1,2,3} |
| 3 | {1,2} |
Upvotes: 3
Views: 8192
Reputation: 656231
In Postgres, ordering rows in a subquery is typically cheaper than adding an ORDER BY
clause to the aggregate function:
SELECT array_agg(col_a) AS col_a, col_b
FROM (
SELECT col_a, array_agg(col_b) AS col_b
FROM (TABLE tbl ORDER BY col_a, col_b) t
GROUP BY 1
ORDER BY 1
) sub
GROUP BY 2
ORDER BY 1;
Produces your desired result exactly, including order of elements and rows.
db<>fiddle here
The 1st ORDER BY
is to sort elements before constructing the array in col_b
.
The 2nd ORDER BY
is to sort elements before constructing the array in col_a
.
The 3rd ORDER BY
is to output rows in the order suggested by the OPs example result.
The query should work the same without 2nd and 3rd ORDER BY
, because the 1st order is carried over. But there are no guarantees. (Things like parallelism might mess with order when querying big tables.) The presented query is guaranteed to produce the desired order.
Related:
About the positional references in GROUP BY
and ORDER BY
:
About (TABLE tbl ORDER BY col_a, col_b)
:
Upvotes: 1
Reputation: 664185
Seems like you want to aggregate col_b
grouped by col_a
, and then aggregate col_a
grouped by the aggregated arrays. You can do this with a nested query, of which you did the first step already:
SELECT array_agg(col_a), col_b
FROM (SELECT col_a, array_agg(DISTINCT col_b ORDER BY col_b) AS col_b
FROM example GROUP BY col_a) grouped_b
GROUP BY col_b;
Upvotes: 3