cka123
cka123

Reputation: 33

R function for expand table by each month, given start month and end month

I have a dataset in R related to the effective start and end month (duration) of price. Below is an example:

print(df)

    Customer Product Price Start_Month End_Month
    ABC      XYZ     100   Jan         Jun
    ABC      XYZ     150   Jul         Dec

I need to write code/function that can expend this table by each month. My expected result is as below:

Customer Product Price Month
ABC      XYZ     100   Jan
ABC      XYZ     100   Feb
ABC      XYZ     100   Mar
ABC      XYZ     100   Apr
ABC      XYZ     100   May
ABC      XYZ     100   Jun
ABC      XYZ     150   Jul
ABC      XYZ     150   Aug
ABC      XYZ     150   Sep
ABC      XYZ     150   Oct
ABC      XYZ     150   Nov
ABC      XYZ     150   Dec

I couldn't find other similar questions that can help me with my problem.

Upvotes: 2

Views: 167

Answers (2)

Ronak Shah
Ronak Shah

Reputation: 388947

A tidyverse approach would be to convert dataframe to long format so that we now have two rows for every one row in original dataframe. Now for every row we use match to get the index of the month value in start and end month and then using that create a sequence.

library(tidyverse)

df %>%
  gather(key, Month, -(1:3)) %>%
  group_by_at(1:3) %>%
  complete(Month = month.abb[match(Month[1], month.abb):
                             match(Month[2], month.abb)]) %>%
  arrange(Customer, Product, Price, match(Month, month.abb)) %>%
  select(-key)


#  Customer Product Price Month
#   <fct>    <fct>   <int> <chr>
# 1 ABC      XYZ       100 Jan  
# 2 ABC      XYZ       100 Feb  
# 3 ABC      XYZ       100 Mar  
# 4 ABC      XYZ       100 Apr  
# 5 ABC      XYZ       100 May  
# 6 ABC      XYZ       100 Jun  
# 7 ABC      XYZ       150 Jul  
# 8 ABC      XYZ       150 Aug  
# 9 ABC      XYZ       150 Sep  
#10 ABC      XYZ       150 Oct  
#11 ABC      XYZ       150 Nov  
#12 ABC      XYZ       150 Dec  

Or another option using map2

df %>%
  mutate(Month = map2(Start_Month, End_Month,
                ~month.abb[match(.x, month.abb) : match(.y, month.abb)])) %>%
  unnest() %>%
  select(-Start_Month, -End_Month)

which in base R would be using Map

do.call(rbind, Map(function(x, y, z) cbind(df[z,], 
       Month = month.abb[match(x, month.abb) : match(y, month.abb)]),
   df$Start_Month, df$End_Month, seq_len(nrow(df))))

Here, we exploit the inbuilt month.abb vector to get the sequence

month.abb
# [1] "Jan" "Feb" "Mar" "Apr" "May" "Jun" "Jul" "Aug" "Sep" "Oct" "Nov" "Dec"

Upvotes: 3

LocoGris
LocoGris

Reputation: 4480

Using base r:

do.call(rbind,lapply(1:nrow(df), 
                             function(x) {
                                      cbind(df[x], 
                                      data.frame(Months=
                                                month.abb[which(month.abb==df[x]$Start_Month):
                                                          which(month.abb==df[x]$End_Month)]))
                                }))

Upvotes: 1

Related Questions