I've got this table (PostgreSQL 9.3):
x | y
- | -
1 | 1
2 | 2
2 | 3
1 | 4
I want to get three partitions out of it: every time the value of x changes (when ordering by y), a new dense_rank value should be assigned. I tried the following:
SELECT x, y, dense_rank() over (partition by x order by y)
FROM tab
But with the partition by x, the ordering does not work the way I expect. The result is:
x y dense_rank
- - ----------
1 1 1
1 4 2
2 2 1
2 3 2
instead of the expected:
x y dense_rank
- - ----------
1 1 1
2 2 2
2 3 2
1 4 3
Now I am not sure why the window is not ordered by y.
In the second step I need this rank for a grouping (GROUP BY dense_rank, x). So in the end I need the following result:
x y dense_rank
- - ----------
1 1 1
2 {2,3} 2
1 4 3
Maybe this could be achieved in an easier way?
Upvotes: 2
Views: 2068
partition over x the ordering is not working the way I expect
It is working perfectly fine. When you partition by x, the first 1 and the last 1 are in the same group.
The PARTITION BY list within OVER specifies dividing the rows into groups, or partitions, that share the same values of the PARTITION BY expression(s). For each row, the window function is computed across the rows that fall into the same partition as the current row.
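As a rough illustration of that definition, the sketch below rebuilds the question's data inline with VALUES (the CTE name tab and the alias partition_members are only assumptions for this example) and lists, for each row, every y value that falls into its partition:

```sql
-- Sketch only: the question's four rows, supplied inline.
WITH tab(x, y) AS (
    VALUES (1, 1), (2, 2), (2, 3), (1, 4)
)
SELECT x, y,
       -- The explicit frame makes every row see its whole partition.
       array_agg(y) OVER (PARTITION BY x ORDER BY y
                          ROWS BETWEEN UNBOUNDED PRECEDING
                                   AND UNBOUNDED FOLLOWING) AS partition_members
FROM tab
ORDER BY y;
```

The rows with y = 1 and y = 4 should both report {1,4}: they share one partition even though they are not adjacent when ordered by y, which is why dense_rank() numbers them 1 and 2.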
To get the result you want, you could use the following (a classic example of the gaps-and-islands problem):
SELECT *, ROW_NUMBER() OVER (ORDER BY y) -
ROW_NUMBER() OVER (PARTITION BY x ORDER BY y) + 1 AS group_id
FROM tab
ORDER BY group_id
Output:
╔═══╦═══╦══════════╗
║ x ║ y ║ group_id ║
╠═══╬═══╬══════════╣
║ 1 ║ 1 ║ 1 ║
║ 2 ║ 2 ║ 2 ║
║ 2 ║ 3 ║ 2 ║
║ 1 ║ 4 ║ 3 ║
╚═══╩═══╩══════════╝
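Warning:
This solution is not general: group_id is only guaranteed to identify a group in combination with x, so two groups with different x values can receive the same group_id.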
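To see why, here is a small sketch on hypothetical data (not the table from the question) in which x alternates 1, 2, 1 when ordered by y:

```sql
-- Hypothetical sample data: x alternates 1, 2, 1 when ordered by y.
WITH tab(x, y) AS (
    VALUES (1, 1), (2, 2), (1, 3)
)
SELECT x, y,
       ROW_NUMBER() OVER (ORDER BY y)
     - ROW_NUMBER() OVER (PARTITION BY x ORDER BY y) + 1 AS group_id
FROM tab
ORDER BY y;
-- group_id comes out as 1, 2, 2: the rows (2, 2) and (1, 3) collide.
```

Grouping by both x and group_id still separates these rows, but group_id on its own is no longer a consecutive rank in this case.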
EDIT:
A more general solution is to use LAG to get the previous value and a windowed SUM:
WITH cte AS
(
SELECT t1.x, t1.y, LAG(x) OVER(ORDER BY y) AS x_prev
FROM tab t1
)
SELECT x,y, SUM( CASE WHEN x = COALESCE(x_prev,x) THEN 0 ELSE 1 END)
OVER(ORDER BY y) + 1 AS group_id
FROM cte
ORDER BY group_id;
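For the second step from the question (grouping by the rank and x), one possible sketch wraps the query above in a further CTE and aggregates y with array_agg. The CTE name grp is an assumption made for this example, and the table is assumed to be called tab as above:

```sql
-- Sketch: assumes the table is named tab, as in the queries above.
WITH cte AS (
    SELECT x, y, LAG(x) OVER (ORDER BY y) AS x_prev
    FROM tab
), grp AS (
    -- Running count of changes in x gives one id per consecutive run.
    SELECT x, y,
           SUM(CASE WHEN x = COALESCE(x_prev, x) THEN 0 ELSE 1 END)
               OVER (ORDER BY y) + 1 AS group_id
    FROM cte
)
SELECT x, array_agg(y ORDER BY y) AS y, group_id
FROM grp
GROUP BY group_id, x
ORDER BY group_id;
```

Note that array_agg returns an array for every group, so single-row groups should come back as one-element arrays such as {1} rather than as a bare value.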
Upvotes: 4