SMN
SMN

Reputation: 21

How to generate date sequence between two dates in same column in R

I am struggling to generate a date sequence between two dates in same column using R script.

I have request id and sequence ID, Date and status. Input table

My requirement is to generate table like this. desired output table

Any help in this regard would be appreciated.

Thank you

Upvotes: 1

Views: 2247

Answers (2)

SMN
SMN

Reputation: 21

Thank you Jasbner! I have installed dplyr and tidyr packages as suggested. I am using 'mutate' to fix the date format.

my csv file (file.csv) holds these data lines

ReqID Seq Created Status

100 1 01/01/2018 Scheduled

100 2 10/01/2018 Execution

100 3 15/01/2018 Hold

100 4 18/01/2018 Complete

101 1 10/01/2018 Scheduled

101 2 18/01/2018 Execution

101 3 20/01/2018 Complete

102 1 18/01/2018 Scheduled

102 2 22/01/2018 Execution

102 3 25/01/2018 Cancelled

103 1 01/02/2018 Scheduled


# my final r script

mydata<-read.csv('file.csv') # Reading data from csv

myindf<-as.data.frame(mydata) # converting it into data frame

myoutdf <- myindf %>% mutate(Created = dmy(Created)) %>% group_by(ReqID) %>% complete(Created = seq.Date(min(Created),max(Created), by = "day")) %>% fill(ReqID,Seq,Status)

print(myoutdf, n = 38) #print all 38 lines

Upvotes: 0

jasbner
jasbner

Reputation: 2283

You can do this with the tidyverse libraries. First set your date column to dates with dmy in the lubridate package. Then you can use tidyr functions complete and fill to extend your datatable as shown. complete has the option to fill in the gaps by day. group_by ReqID to do this for each of your individual identifiers.

library(tidyverse)
library(lubridate)
df <- data_frame(ReqID = 100, ID_Seq = 1:3, Created = dmy("01/01/2018","10/01/2018","18/01/2018"), Status = c("Scheduled","In Execution", "Completed"))
df %>%
  group_by(ReqID) %>% 
  complete(Created = seq.Date(min(Created),max(Created), by = "day")) %>%
  fill(ReqID,ID_Seq,Status)

## A tibble: 18 x 4
#   Created    ReqID ID_Seq Status      
#   <date>     <dbl>  <int> <chr>       
# 1 2018-01-01   100      1 Scheduled   
# 2 2018-01-02   100      1 Scheduled   
# 3 2018-01-03   100      1 Scheduled   
# 4 2018-01-04   100      1 Scheduled   
# 5 2018-01-05   100      1 Scheduled   
# 6 2018-01-06   100      1 Scheduled   
# 7 2018-01-07   100      1 Scheduled   
# 8 2018-01-08   100      1 Scheduled   
# 9 2018-01-09   100      1 Scheduled   
#10 2018-01-10   100      2 In Execution
#11 2018-01-11   100      2 In Execution
#12 2018-01-12   100      2 In Execution
#13 2018-01-13   100      2 In Execution
#14 2018-01-14   100      2 In Execution
#15 2018-01-15   100      2 In Execution
#16 2018-01-16   100      2 In Execution
#17 2018-01-17   100      2 In Execution
#18 2018-01-18   100      3 Completed  

Upvotes: 4

Related Questions