Badri Narayanan
Badri Narayanan

Reputation: 23

what is the R code for splitting columns and adding a date

I have a input file like this

Product Days SALES
Google MTWTFSASU 28000
Google TFSASU 10000

I need the data to split like this. The date here is a start_date value which I input from a txt file . LEts consider start_date="01/01/2021". Then the first occurrence should have the date as "01/01/2021" for Monday and then the consecutive days based on the day(Monday-Sunday).

Product Days SALES NO_OF_DAYS DATE
Google M 4000 7 01/01/2021
Google T 4000 7 01/02/2021
Google W 4000 7 01/03/2021
Google T 4000 7 01/04/2021
Google F 4000 7 01/05/2021
Google SA 4000 7 01/06/2021
Google SU 4000 7 01/07/2021
Apple T 2500 4 01/02/2021
Apple F 2500 4 01/05/2021
Apple SA 2500 4 01/06/2021
Apple SU 2500 4 01/07/2021

Upvotes: 0

Views: 27

Answers (1)

akrun
akrun

Reputation: 886938

One option is to create a key/value dataset that maps the 'Days' with the 'DATE' and after the unnesting step do a join

library(dplyr)
library(tidyr)
library(stringr)
keydat <- tibble(Days = c("M", "T", "W", "TH", "F", "SA", "SU"),
    DATE = format(seq(as.Date('2021-01-01'),
        length.out = length(Days), by = '1 day'), '%m/%d/%Y'))

df1 %>% 
   mutate(Days = str_extract_all(Days, "M|TH|W|T|F|SA|SU"),
       No_of_Days = lengths(Days),
       SALES = SALES/No_of_Days) %>% 
       unnest(c(Days)) %>%
       left_join(keydat)

-output

# A tibble: 11 x 5
#   Product Days  SALES No_of_Days DATE      
#   <chr>   <chr> <dbl>      <int> <chr>     
# 1 Google  M      4000          7 01/01/2021
# 2 Google  T      4000          7 01/02/2021
# 3 Google  W      4000          7 01/03/2021
# 4 Google  TH     4000          7 01/04/2021
# 5 Google  F      4000          7 01/05/2021
# 6 Google  SA     4000          7 01/06/2021
# 7 Google  SU     4000          7 01/07/2021
# 8 Apple   TH     2500          4 01/04/2021
# 9 Apple   F      2500          4 01/05/2021
#10 Apple   SA     2500          4 01/06/2021
#11 Apple   SU     2500          4 01/07/2021

data

df1 <- structure(list(Product = c("Google", "Apple"), Days = c("MTWTHFSASU", 
"THFSASU"), SALES = c(28000L, 10000L)), row.names = c(NA, -2L
), class = "data.frame")

Upvotes: 1

Related Questions