Reputation: 3978
I'm wondering what is happening here in postgres, consider this ready-to-run snippet
select id,
value,
array_agg(id) over (order by value asc) as "array_agg(id) with order",
array_agg(id) over () as "array_agg(id) without order"
from
(
values
(1, 1000),
(2, 2000)
) as rows (id, value)
order by id asc
You may notice that i'm using window function to get aggregrates from the window frame. In both projections "array_agg(id) with order" and "array_agg(id) without order", no filtering is beign made. So i wonder, why the column where i'm ordering the partition is actually giving me the impression of filtering by having just one id in the ordered and two in the not ordered, ¿what verb you put here? for me it would be filtering. The weird thing that gets me more paranoid is that the window frame looks like the same in both partitions when I use a not aggregate function like "lead", please serve yourself:
select id,
value,
lead(id) over (order by value asc) as "lead(id) with order",
lead(id) over () as "lead(id) without order",
array_agg(id) over (order by value asc) as "array_agg(id) with order",
array_agg(id) over () as "array_agg(id) without order"
from
(
values
(1, 1000),
(2, 2000)
) as rows (id, value)
order by id asc
I read the official doc about Window Functions with no clues on this. If any one may explain what is the true behind the milanesa i will more than greatful.
Upvotes: 2
Views: 556
Reputation: 247595
Like the documentation says:
The
frame_clause
specifies the set of rows constituting the window frame, which is a subset of the current partition, for those window functions that act on the frame instead of the whole partition.
Now array_agg
acts on the frame (it aggregates all rows in the frame), while lead
doesn't. The documentation goes on to explain:
The default framing option is
RANGE UNBOUNDED PRECEDING
, which is the same asRANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
. WithORDER BY
, this sets the frame to be all rows from the partition start up through the current row's lastORDER BY
peer. WithoutORDER BY
, this means all rows of the partition are included in the window frame, since all rows become peers of the current row.
So the presence of ORDER BY
changes the meaning of the default frame of the window.
Upvotes: 3
Reputation: 44343
You have overlooked this paragraph from the tutorial:
By default, if ORDER BY is supplied then the frame consists of all rows from the start of the partition up through the current row, plus any following rows that are equal to the current row according to the ORDER BY clause.
To overcome that, use this construct:
select id,
value,
array_agg(id) over (order by value asc rows between unbounded preceding and unbounded following) as "array_agg(id) with order",
array_agg(id) over () as "array_agg(id) without order"
from
(
values
(1, 1000),
(2, 2000)
) as rows (id, value)
order by id asc;
Upvotes: 3