Aunib
Aunib

Reputation: 28

Generate cumulative information based on an ordered list using SQL

I'd like to generate cumulative information based on an ordered list. In the example below, I'd like to generate the days of rain based on the other 3 columns. So ordering by town and day, then give null if it's dry, and a cumulative score if the weather is rain.

Procedurally this is quite easy, but it feels like there should be a way to generate it directly with sql and I just can't figure it out.

Possibly you can use analytic range windows, but I can't figure out how to do that across 3 columns. And the need to reset if there are intervening rows means I can't just ignore the 'dry' rows and just use row_number().

town       day weather  days of rain
Stevenage  1   dry      
Stevenage  2   dry      
Stevenage  3   rain     1
Stevenage  4   rain     2
Stevenage  4   rain     3
Stevenage  5   dry      
Stevenage  6   dry      
Stevenage  8   rain     1
Stevenage  9   rain     2
Stevenage  10  dry      
Watford    1   dry      
Watford    2   dry      
Watford    3   rain     1
Watford    4   rain     2





create table rain_test (town varchar2(20), day number, weather varchar2(10), days_of_rain number);

insert into rain_test(town, day, weather) values ('Stevenage', 1, 'dry');
insert into rain_test(town, day, weather) values ('Stevenage', 2, 'rain');
insert into rain_test(town, day, weather) values ('Stevenage', 3, 'rain');
insert into rain_test(town, day, weather) values ('Stevenage', 4, 'rain');
insert into rain_test(town, day, weather) values ('Stevenage', 5, 'dry');
insert into rain_test(town, day, weather) values ('Stevenage', 6, 'dry');
insert into rain_test(town, day, weather) values ('Stevenage', 7, 'rain');
insert into rain_test(town, day, weather) values ('Stevenage', 8, 'rain');
insert into rain_test(town, day, weather) values ('Stevenage', 9, 'rain');
insert into rain_test(town, day, weather) values ('Stevenage', 10, 'dry');
insert into rain_test(town, day, weather) values ('Watford', 1, 'dry');
insert into rain_test(town, day, weather) values ('Watford', 2, 'dry');
insert into rain_test(town, day, weather) values ('Watford', 3, 'rain');
insert into rain_test(town, day, weather) values ('Watford', 4, 'rain');
commit;

Upvotes: 1

Views: 52

Answers (2)

Noel
Noel

Reputation: 10525

Using Analytical functions,

with x as (
  select town,
         day,
         weather,
         case when weather =                --lag function to find out when the weather changes.
                    lag(weather,1) over (partition by town order by day)
              then 0
              else 1
         end boundary
    from rain_test
),
y as (
  select town, day, weather,
         sum(boundary) over (partition by town order by day) grp    --Sum function to assign a unique group number to a sequence of same weather.
    from x
  )
select town, day, weather,
       case when weather = 'rain'
            then row_number() over (partition by town, grp order by day)    --row_number function to assign unique number to each row in a group.
       end
  from y
order by town, day;

Demo at sqlfiddle

Upvotes: 1

Miguel Veloso
Miguel Veloso

Reputation: 1095

there you go, just for the challenge, definitely much easier procedurally, but not that fun:

select town, day, weather, null days_of_rain
from   rain_test
where  weather = 'dry'
union
select day.town, day.day, weather, day.day - chain.begin_day + 1 days_of_rain
from   rain_test day,
       (
       select town, chain, sum(begin_day) begin_day, sum(end_day) end_day
       from   (/* to find the chains of rainy days */       
              select town, rownum chain, day begin_day, 0 end_day
              from   (/* to find the begining days */
                     select rain.town, rain.day 
                     from   rain_test rain,
                            rain_test day 
                     where  rain.town = day.town
                     and    rain.weather = 'rain'
                     and    day.day = rain.day - 1
                     and    day.weather = 'dry'
                     union /* border condition */
                     select town, min(day) day
                     from   rain_test
                     where  weather = 'rain'
                     group by town
                     order by town, day
                     )
              union all
              select town, rownum chain, 0 begin_day, day end_day
              from   (/* to find the ending days */       
                     select rain.town, rain.day 
                     from   rain_test rain,
                            rain_test day 
                     where  rain.town = day.town
                     and    rain.weather = 'rain'
                     and    day.day = rain.day + 1
                     and    day.weather = 'dry'
                     union /* border condition */
                     select town, max(day) day
                     from   rain_test
                     where  weather = 'rain'
                     group by town
                     order by town, day
                     )    
              )
       group by town, chain   
       ) chain
where  chain.town = day.town
and    chain.begin_day <= day.day
and    chain.end_day >= day.day          
order by town, day

Upvotes: 0

Related Questions