Reputation: 28
I'd like to generate cumulative information based on an ordered list. In the example below, I'd like to generate the days of rain based on the other 3 columns. So ordering by town and day, then give null if it's dry, and a cumulative score if the weather is rain.
Procedurally this is quite easy, but it feels like there should be a way to generate it directly with sql and I just can't figure it out.
Possibly you can use analytic range windows, but I can't figure out how to do that across 3 columns. And the need to reset if there are intervening rows means I can't just ignore the 'dry' rows and just use row_number().
town day weather days of rain
Stevenage 1 dry
Stevenage 2 dry
Stevenage 3 rain 1
Stevenage 4 rain 2
Stevenage 4 rain 3
Stevenage 5 dry
Stevenage 6 dry
Stevenage 8 rain 1
Stevenage 9 rain 2
Stevenage 10 dry
Watford 1 dry
Watford 2 dry
Watford 3 rain 1
Watford 4 rain 2
create table rain_test (town varchar2(20), day number, weather varchar2(10), days_of_rain number);
insert into rain_test(town, day, weather) values ('Stevenage', 1, 'dry');
insert into rain_test(town, day, weather) values ('Stevenage', 2, 'rain');
insert into rain_test(town, day, weather) values ('Stevenage', 3, 'rain');
insert into rain_test(town, day, weather) values ('Stevenage', 4, 'rain');
insert into rain_test(town, day, weather) values ('Stevenage', 5, 'dry');
insert into rain_test(town, day, weather) values ('Stevenage', 6, 'dry');
insert into rain_test(town, day, weather) values ('Stevenage', 7, 'rain');
insert into rain_test(town, day, weather) values ('Stevenage', 8, 'rain');
insert into rain_test(town, day, weather) values ('Stevenage', 9, 'rain');
insert into rain_test(town, day, weather) values ('Stevenage', 10, 'dry');
insert into rain_test(town, day, weather) values ('Watford', 1, 'dry');
insert into rain_test(town, day, weather) values ('Watford', 2, 'dry');
insert into rain_test(town, day, weather) values ('Watford', 3, 'rain');
insert into rain_test(town, day, weather) values ('Watford', 4, 'rain');
commit;
Upvotes: 1
Views: 52
Reputation: 10525
Using Analytical functions,
with x as (
select town,
day,
weather,
case when weather = --lag function to find out when the weather changes.
lag(weather,1) over (partition by town order by day)
then 0
else 1
end boundary
from rain_test
),
y as (
select town, day, weather,
sum(boundary) over (partition by town order by day) grp --Sum function to assign a unique group number to a sequence of same weather.
from x
)
select town, day, weather,
case when weather = 'rain'
then row_number() over (partition by town, grp order by day) --row_number function to assign unique number to each row in a group.
end
from y
order by town, day;
Upvotes: 1
Reputation: 1095
there you go, just for the challenge, definitely much easier procedurally, but not that fun:
select town, day, weather, null days_of_rain
from rain_test
where weather = 'dry'
union
select day.town, day.day, weather, day.day - chain.begin_day + 1 days_of_rain
from rain_test day,
(
select town, chain, sum(begin_day) begin_day, sum(end_day) end_day
from (/* to find the chains of rainy days */
select town, rownum chain, day begin_day, 0 end_day
from (/* to find the begining days */
select rain.town, rain.day
from rain_test rain,
rain_test day
where rain.town = day.town
and rain.weather = 'rain'
and day.day = rain.day - 1
and day.weather = 'dry'
union /* border condition */
select town, min(day) day
from rain_test
where weather = 'rain'
group by town
order by town, day
)
union all
select town, rownum chain, 0 begin_day, day end_day
from (/* to find the ending days */
select rain.town, rain.day
from rain_test rain,
rain_test day
where rain.town = day.town
and rain.weather = 'rain'
and day.day = rain.day + 1
and day.weather = 'dry'
union /* border condition */
select town, max(day) day
from rain_test
where weather = 'rain'
group by town
order by town, day
)
)
group by town, chain
) chain
where chain.town = day.town
and chain.begin_day <= day.day
and chain.end_day >= day.day
order by town, day
Upvotes: 0