Reputation: 362
I have in my table rows ordered by rank. All rank values are different. Also I have color column with possible values (red, blue, green). I need iterate over earch row (named it X for example) and check if in next n rows exist row (named it Y for example) where X.color == Y.color then need move Y row and insert it after X row. For example n = 5 and starting table:
id rank color
1 0.8 red
3 0.76 blue
4 0.75 blue
7 0.743 green
5 0.742 red
6 0.741 green
8 0.6 blue
9 0.5 blue
11 0.47 green
12 0.45 blue
10 0.4 red
1 Iterate, we have row with color red in next n rows, then:
id rank color
1 0.8 red
5 0.742 red
3 0.76 blue
4 0.75 blue
7 0.743 green
6 0.741 green
8 0.6 blue
9 0.5 blue
11 0.47 green
12 0.45 blue
10 0.4 red
2 Iterate, we don't have red in next n rows, then no changes
3 Iterate, we have blue in next n rows, but next blue row already after current.
4 Iterate, we have blue in next n rows, then:
id rank color
1 0.8 red
5 0.742 red
3 0.76 blue
4 0.75 blue
8 0.6 blue
7 0.743 green
6 0.741 green
9 0.5 blue
11 0.47 green
12 0.45 blue
10 0.4 red
5 Iterate, we have blue in next n rows, then:
id rank color
1 0.8 red
5 0.742 red
3 0.76 blue
4 0.75 blue
8 0.6 blue
9 0.5 blue
7 0.743 green
6 0.741 green
11 0.47 green
12 0.45 blue
10 0.4 red
6 Iterate, we have blue in next n rows, then:
id rank color
1 0.8 red
5 0.742 red
3 0.76 blue
4 0.75 blue
8 0.6 blue
9 0.5 blue
12 0.45 blue
7 0.743 green
6 0.741 green
11 0.47 green
10 0.4 red
7 Iterate, we have green in next n rows, but next green row already after current.
8 Iterate, we have green in next n rows, but next green row already after current.
9 Iterate, we have green in next n rows, but next green row already after current.
10 Iterate no changes.
Someone has an idea how to solve this problem, I tried to fasten window functions, but it is not clear whether it can be done with the help of them, maybe some other mechanism is needed?
Upvotes: 0
Views: 145
Reputation: 6723
It's really common for folks relatively new to SQL to think about iteration when faced with a problem like this, but it is possible to do this without any iteration. Let's break it down a bit. First, let's find the row_id of all the data, based on the rank column:
with data as (select * from (values
(1,0.8 ,'red'),
(3,0.76 ,'blue'),
(4,0.75 ,'blue'),
(7,0.743 ,'green'),
(5,0.742 ,'red'),
(6,0.741 ,'green'),
(8,0.6 ,'blue'),
(9,0.5 ,'blue'),
(11,0.47 ,'green'),
(12,0.45 ,'blue'),
(10,0.4 ,'red')
) v(id, rank, color)
)
select id, rank, color,
row_number() over (order by rank desc) as row_id
FROM data
ORDER BY rank desc;
id | rank | color | row_id
----+-------+-------+--------
1 | 0.8 | red | 1
3 | 0.76 | blue | 2
4 | 0.75 | blue | 3
7 | 0.743 | green | 4
5 | 0.742 | red | 5
6 | 0.741 | green | 6
8 | 0.6 | blue | 7
9 | 0.5 | blue | 8
11 | 0.47 | green | 9
12 | 0.45 | blue | 10
10 | 0.4 | red | 11
(11 rows)
From there, we can figure out the minimum row in the previous five rows that has the same color:
with data as (select * from (values
(1,0.8 ,'red'),
(3,0.76 ,'blue'),
(4,0.75 ,'blue'),
(7,0.743 ,'green'),
(5,0.742 ,'red'),
(6,0.741 ,'green'),
(8,0.6 ,'blue'),
(9,0.5 ,'blue'),
(11,0.47 ,'green'),
(12,0.45 ,'blue'),
(10,0.4 ,'red')
) v(id, rank, color)
), all_rows as (
select id, rank, color,
row_number() over (order by rank desc) as row_id
FROM data
ORDER BY rank desc
)
select id, rank, color, row_id,
first_value(row_id) over (partition by color order by row_id range between 5
preceding and current row)
from all_rows
order by 5, rank desc;
id | rank | color | row_id | first_value
----+-------+-------+--------+-------------
1 | 0.8 | red | 1 | 1
5 | 0.742 | red | 5 | 1
3 | 0.76 | blue | 2 | 2
4 | 0.75 | blue | 3 | 2
8 | 0.6 | blue | 7 | 2
9 | 0.5 | blue | 8 | 3
7 | 0.743 | green | 4 | 4
6 | 0.741 | green | 6 | 4
11 | 0.47 | green | 9 | 4
12 | 0.45 | blue | 10 | 7
10 | 0.4 | red | 11 | 11
(11 rows)
This is close, but it's not quite right. We need to get the minimum of that minimum from the last 5 rows for each color. We'll need one more step:
with data as (select * from (values
(1,0.8 ,'red'),
(3,0.76 ,'blue'),
(4,0.75 ,'blue'),
(7,0.743 ,'green'),
(5,0.742 ,'red'),
(6,0.741 ,'green'),
(8,0.6 ,'blue'),
(9,0.5 ,'blue'),
(11,0.47 ,'green'),
(12,0.45 ,'blue'),
(10,0.4 ,'red')
) v(id, rank, color)
), all_rows as (
select id, rank, color,
row_number() over (order by rank desc) as row_id
FROM data
ORDER BY rank desc
), first_values as (
select id, rank, color, row_id,
first_value(row_id) over (partition by color order by row_id range between 5
preceding and current row)
from all_rows
order by 5, rank desc
)
select id, rank, color, row_id,
first_value(first_value) over (partition by color order by row_id range between 5 preceding and current row) as overall_rank
FROM first_values
ORDER BY overall_rank, rank desc
;
id | rank | color | row_id | overall_rank
----+-------+-------+--------+--------------
1 | 0.8 | red | 1 | 1
5 | 0.742 | red | 5 | 1
3 | 0.76 | blue | 2 | 2
4 | 0.75 | blue | 3 | 2
8 | 0.6 | blue | 7 | 2
9 | 0.5 | blue | 8 | 2
12 | 0.45 | blue | 10 | 2
7 | 0.743 | green | 4 | 4
6 | 0.741 | green | 6 | 4
11 | 0.47 | green | 9 | 4
10 | 0.4 | red | 11 | 11
(11 rows)
Upvotes: 1