Jonathan Nolan

Reputation: 145

Compare a date variable with a list of dates

I have a data frame with a datetime variable (made with lubridate):

     str(raw_data$date)
 POSIXct[1:37166], format: "2016-11-04 09:12:38" "2016-11-04 09:04:08" "2016-11-04 09:04:14" "2016-11-04 09:08:01" "2016-11-04 09:11:56" ...

and a list of school term dates, stored as consecutive start and end pairs:

vsdate<- c("2017/01/30","2017/03/31","2017/04/18","2017/06/30","2017/07/17","2017/09/22","2017/10/09","2017/12/22","2018/01/30","2018/03/29","2018/04/16","2018/06/29","2018/07/16","2018/09/21","2018/10/08","2018/12/21")

vsdate <- as_date(vsdate)

I want to flag whether each date in raw_data falls within one of the school terms, i.e. between a pair of dates in the list. I have done this in base R below, but I can't get it to work in the tidyverse:

    vsdate<- c("2017/01/30","2017/03/31","2017/04/18","2017/06/30","2017/07/17","2017/09/22","2017/10/09","2017/12/22","2018/01/30","2018/03/29","2018/04/16","2018/06/29","2018/07/16","2018/09/21","2018/10/08","2018/12/21")

vsdate <- as.Date(vsdate)

raw_data$Vic.School.Term=0
raw_data[raw_data$date<=vsdate[2]& raw_data$date>=vsdate[1],"Vic.School.Term"]<-1 
raw_data[raw_data$date<vsdate[4]& raw_data$date>=vsdate[3],"Vic.School.Term"]<-1 
raw_data[raw_data$date<vsdate[6]& raw_data$date>=vsdate[5],"Vic.School.Term"]<-1 
raw_data[raw_data$date<vsdate[8]& raw_data$date>=vsdate[7],"Vic.School.Term"]<-1 
raw_data[raw_data$date<=vsdate[10]& raw_data$date>=vsdate[9],"Vic.School.Term"]<-1 
raw_data[raw_data$date<vsdate[12]& raw_data$date>=vsdate[11],"Vic.School.Term"]<-1 
raw_data[raw_data$date<vsdate[14]& raw_data$date>=vsdate[13],"Vic.School.Term"]<-1 
raw_data[raw_data$date<vsdate[16]& raw_data$date>=vsdate[15],"Vic.School.Term"]<-1 

and here is my failed attempt in the tidyverse:

    raw_data<- raw_data <- mutate(school.term=case_when(
   between(date,vsdate[1],vsdate[2] ~ 1)))
Error in between(date, vsdate[1], vsdate[2] ~ 1) : 
  Expecting a single value: [extent=3].

Thanks!

Upvotes: 0

Views: 729

Answers (1)

Jrakru56

Reputation: 1321

The closing parenthesis of your between() call is in the wrong place, so the ~ 1 ends up inside it. The signature is between(x, left, right), but you have between(x, left, right ~ 1). See below for the first few cases:

library(dplyr)
library(lubridate)
raw_data <- data.frame( date = c("2016-11-04 09:12:38", "2016-11-04 09:04:08",
                                 "2016-11-04 09:04:14", "2016-11-04 09:08:01",
                                 "2016-11-04 09:11:56", "2017-02-15 09:10:01",
                                 "2017-05-01 10:00:00")
)



raw_data %>% mutate(date = ymd_hms(date)) -> raw_data

str(raw_data)

vsdate<- ymd(c("2017/01/30","2017/03/31","2017/04/18","2017/06/30",
           "2017/07/17","2017/09/22","2017/10/09","2017/12/22",
           "2018/01/30","2018/03/29","2018/04/16","2018/06/29",
           "2018/07/16","2018/09/21","2018/10/08","2018/12/21"))

str(vsdate)

raw_data %>% mutate(school.term = case_when(between(as.Date(date), vsdate[1], vsdate[2]) ~ 1,
                                            between(as.Date(date), vsdate[3], vsdate[4]) ~ 1,
                                            TRUE ~ 0))

                 date school.term
1 2016-11-04 09:12:38           0
2 2016-11-04 09:04:08           0
3 2016-11-04 09:04:14           0
4 2016-11-04 09:08:01           0
5 2016-11-04 09:11:56           0
6 2017-02-15 09:10:01           1
7 2017-05-01 10:00:00           1

Also, note the as.Date() call inside between(). It converts the POSIXct datetime to a Date, so that both sides of the comparison have the same class as vsdate.
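If writing one between() line per term gets tedious, the same check can be run over every start/end pair at once. The following is only a sketch, not a tested solution for your full data: the helper in_any_term() and the names term_start and term_end are made up for illustration, and it assumes vsdate always alternates start, end, start, end. The sample datetimes are invented so that one of them falls in the gap between two terms.

```r
library(dplyr)
library(lubridate)

# Small made-up sample; the third datetime falls between two terms
raw_data <- data.frame(
  date = ymd_hms(c("2016-11-04 09:12:38", "2017-02-15 09:10:01",
                   "2017-04-05 12:00:00", "2017-05-01 10:00:00"))
)

vsdate <- ymd(c("2017/01/30","2017/03/31","2017/04/18","2017/06/30",
                "2017/07/17","2017/09/22","2017/10/09","2017/12/22",
                "2018/01/30","2018/03/29","2018/04/16","2018/06/29",
                "2018/07/16","2018/09/21","2018/10/08","2018/12/21"))

# Assumption: odd positions are term starts, even positions are term ends
term_start <- vsdate[c(TRUE, FALSE)]
term_end   <- vsdate[c(FALSE, TRUE)]

# Hypothetical helper: TRUE where a datetime falls inside any term,
# with both bounds inclusive (the same convention as between())
in_any_term <- function(x, start, end) {
  day   <- as.numeric(as.Date(x))   # days since 1970-01-01
  start <- as.numeric(start)
  end   <- as.numeric(end)
  vapply(day, function(d) any(d >= start & d <= end), logical(1))
}

result <- raw_data %>%
  mutate(school.term = as.integer(in_any_term(date, term_start, term_end)))

result$school.term
# 0 1 0 1
```

Both bounds are inclusive here, whereas the base R code in the question uses < for some of the end dates, so the comparison inside in_any_term() may need adjusting if the last day of term should be excluded.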

Upvotes: 1
