Sparky
Sparky

Reputation: 359

How to add new night-time/day-time variable depending on time range between two variables

I wanted to add a new variable depending on the time range between two variables. I want times between 8:01-20:00 = day and times between 20:01-8:00 = night, and anything that overlaps both to be mixed.

I've tried to add the variable manually but trying to understand can this be done an easier way.

#Current database
id<-c("m1","m1","m1","m2","m2","m2","m3","m4","m4")
x<-c("2020-01-03 10:00:00","2020-01-03 16:00:00","2020-01-03 19:20:00","2020-01-05 10:00:00","2020-01-05 15:20:00","2020-01-05 20:50:00","2020-01-06 06:30:00","2020-01-08 06:30:00","2020-01-08 07:50:00")
start<-strptime(x,"%Y-%m-%d %H:%M:%S")
y<-c("2020-01-03 16:00:00","2020-01-03 19:20:00","2020-01-03 20:50:00","2020-01-05 15:20:00","2020-01-05 20:50:00","2020-01-05 22:00:00","2020-01-06 07:40:00","2020-01-08 07:50:00","2020-01-08 08:55:00")
end<-strptime(y,"%Y-%m-%d %H:%M:%S")
mydata<-data.frame(id,start,end)

#output
day.night<-c("day","day","mixed","day","mixed","night","night","night","mixed")
newdata<-cbind(mydata,day.night)

Edit: Apologies I forgot to add the date.

Upvotes: 0

Views: 396

Answers (1)

Ronak Shah
Ronak Shah

Reputation: 389155

One way using dplyr is to convert start.time and end.time to POSIXct object and then compare the values at various intervals and apply labels using case_when.

library(dplyr)

data %>%
   mutate(start.time1 = as.POSIXct(start.time, format = "%H:%M"), 
          end.time1 =  as.POSIXct(end.time, format = "%H:%M"), 
          day.night =  case_when(
          start.time1 > as.POSIXct('08:01:00', format = "%T") &
          end.time1 < as.POSIXct('20:00:00', format = "%T") ~"day",
          start.time1 > as.POSIXct('20:01:00', format = "%T") |
          start.time1 < as.POSIXct('08:00:00', format = "%T") & 
          end.time1 < as.POSIXct('08:00:00', format = "%T") ~ "night",
          TRUE ~ "mixed")) %>%
   select(names(data), day.night)

#  id start.time end.time day.night
#1 m1      10:00    16:00       day
#2 m1      16:00    19:20       day
#3 m1      19:20    20:50     mixed
#4 m2      10:00    15:20       day
#5 m2      15:20    20:50     mixed
#6 m2      20:50    22:00     night
#7 m3      06:30    07:40     night
#8 m4      06:30    07:50     night
#9 m4      07:50    08:55     mixed

EDIT

If we also have date one way would be to replace the date component in start and end with todays date for comparison purposes.

library(dplyr)
library(lubridate)

mydata %>%
   mutate_at(vars(start, end), ymd_hms) %>%
   mutate(start_hour = hour(start), 
          end_hour = hour(end), 
    day.night = case_when(start_hour >= 8 & end_hour >= 8 & end_hour < 20 ~ "day", 
                          start_hour >= 20 &  (end_hour < 8 | end_hour <= 23) |
                         (start_hour < 8 & end_hour < 8)~ "night", 
                           TRUE ~ "mixed"))

#    id               start                 end day.night
#1 m1 2020-01-03 10:00:00 2020-01-03 16:00:00       day
#2 m1 2020-01-03 16:00:00 2020-01-03 19:20:00       day
#3 m1 2020-01-03 19:20:00 2020-01-03 20:50:00     mixed
#4 m2 2020-01-05 10:00:00 2020-01-05 15:20:00       day
#5 m2 2020-01-05 15:20:00 2020-01-05 20:50:00     mixed
#6 m2 2020-01-05 20:50:00 2020-01-05 22:00:00     night
#7 m3 2020-01-06 06:30:00 2020-01-06 07:40:00     night
#8 m4 2020-01-08 06:30:00 2020-01-08 07:50:00     night
#9 m4 2020-01-08 07:50:00 2020-01-08 08:55:00     mixed

Upvotes: 2

Related Questions