S-Man
S-Man

Reputation: 23726

PostgreSQL Window Function ordering

I've got this table (PostgreSQL 9.3):

x | y  
- | -  
1 | 1  
2 | 2  
2 | 3  
1 | 4

Now I try to get three partitions out of it: Every time the value x is changing (by ordering y), a new dense_rank value should be given. Now I tried the following:

SELECT x, y, dense_rank() over (partition by x order by y) 
FROM table

But with the partition over x the ordering is not working the way I expect. The result is

x  y  dense_rank  
-  -  ----------  
1  1  1     
1  4  2     
2  2  1     
2  3  2 

instead of the expected:

x  y  dense_rank  
-  -  ----------  
1  1  1     
2  2  2     
2  3  2     
1  4  3 

Now I am not sure why the window is not ordered by y.

In the second step I need this rank for a grouping (GROUP BY dense_rank, x). So in the end I need the following result:

x  y      dense_rank  
-  -      ----------  
1  1      1     
2  {2,3}  2         
1  4      3 

Maybe this could be achieved in an easier way?

Upvotes: 2

Views: 2068

Answers (1)

Lukasz Szozda
Lukasz Szozda

Reputation: 175964

partition over x the ordering is not working the way I expect

It is working perfectly fine. When you partition by x first 1 and last 1 are in the same group.

Window Functions:

The PARTITION BY list within OVER specifies dividing the rows into groups, or partitions, that share the same values of the PARTITION BY expression(s). For each row, the window function is computed across the rows that fall into the same partition as the current row.

To get result you want you could use (classic example of gaps and islands problem):

SELECT *, ROW_NUMBER() OVER (ORDER BY y) -
          ROW_NUMBER() OVER (PARTITION BY x ORDER BY y) + 1 AS group_id
FROM tab
ORDER BY group_id

LiveDemo

Output:

╔═══╦═══╦══════════╗
║ x ║ y ║ group_id ║
╠═══╬═══╬══════════╣
║ 1 ║ 1 ║        1 ║
║ 2 ║ 2 ║        2 ║
║ 2 ║ 3 ║        2 ║
║ 1 ║ 4 ║        3 ║
╚═══╩═══╩══════════╝

Warning:
This solution is not general.

EDIT:

More general solution is to utilize LAG to get previous value and windowed SUM:

WITH cte AS
(
  SELECT t1.x, t1.y, LAG(x) OVER(ORDER BY y) AS x_prev
  FROM tab t1
)
SELECT x,y, SUM( CASE WHEN x = COALESCE(x_prev,x) THEN 0 ELSE 1 END) 
            OVER(ORDER BY y) + 1 AS group_id
FROM cte
ORDER BY group_id;

LiveDemo2

Upvotes: 4

Related Questions