DanG
DanG

Reputation: 741

Select and filter data data from 03:00 am to 03:00 am next day

I am trying to find the first "ON" record that occurs between 03:00:00 and 03:00:00 next day for each combination of date and id (3 Am to next day 3 AM).

    #dummy data
df <- tibble::tribble(
       ~id, ~code, ~start_day, ~hhmmss,       ~end_time,
  "7050-1",               "ON",   20200227,      "000000", 20200227002400,
  "7050-1",             "SNOOZE",   20200227,        "002400", 20200227003400,
  "7050-1",              "OFF",   20200227,        "003400", 20200227003545,
  "7050-1",               "ON",   20200227,        "003545", 20200227004815,
  "7050-1",             "SLP",   20200227,        "004815", 20200227021400,
  "7050-1",             "SLP",   20200227,       "021400", 20200227073415,
  "7050-1",               "ON",   20200227,       "073415", 20200227164515,
  "7050-1",               "ON",   20200228,      "025936", 20200227045936,
   "265-1",               "ON",   20200227,      "000000", 20200227002400,
   "265-1",             "SNOOZE",   20200227,      "164515", 20200227165515,
   "265-1",              "OFF",   20200227,      "165515", 20200228025936,
   "265-1",               "ON",   20200228,      "023536", 20200228025536,
  "265-1",               "OFF",   20200228,      "025536", 20200228003000,
  "265-1",               "ON",   20200228,       "03000", 20200228035936,
  "265-2",               "OFF",   20200228,      "000000", 20200228180000,
  "265-2",               "OFF",   20200228,      "180000", 20200228235959,
  "265-2",               "ON",   20200229,       "000000", 20200229020000,
  )

Here is my try so far

df %>%
  mutate(
    time = format(strptime(hhmmss, format = "%H%M%S"), format = "%H:%M:%S"),
    time = hms::as.hms(time, format = "%H:%M:%S") ,
    date = ymd(start_day) ) %>%
  group_by(date,id) %>%
  filter(time >= hms::as.hms("02:59:59", format = "%H:%M:%S") & code == "ON")

But I shouldn't use filter like that since than I lose the 265-2record. Here is my desire output. (** 265-2 records occurs between shift 28-02-28 03:00:00 and 28-02-29 03:00:00 and should assign to date 28/02 not 29/02 . Hope it is clear

  id     code  date   time        

 7050-1 ON     20200227     07:34:15 
 265-1  ON     20200228     03:00:00 
 265-2  ON     20200228     00:00:00 

Upvotes: 0

Views: 55

Answers (1)

jyjek
jyjek

Reputation: 2707

idk, maybe I didn't understand your logic, but hope this would help you

df %>% 
   mutate(date = ymd(start_day),
          time = format(strptime(hhmmss, format = "%H%M%S"), format = "%H:%M:%S"),
          time = hms::as.hms(time, format = "%H:%M:%S")) %>% 
   mutate(new_date = as_datetime(glue::glue('{date} {time}')) - hours(3),
          new_date = as_date(new_date)) %>% 
   filter(code == "ON")
# A tibble: 8 x 8
  id     code  start_day hhmmss end_time date       time   new_date  
  <chr>  <chr>     <dbl> <chr>     <dbl> <date>     <drtn> <date>    
1 7050-1 ON     20200227 000000  2.02e13 2020-02-27 00:00  2020-02-26
2 7050-1 ON     20200227 003545  2.02e13 2020-02-27 00:35  2020-02-26
3 7050-1 ON     20200227 073415  2.02e13 2020-02-27 07:34  2020-02-27
4 7050-1 ON     20200228 025936  2.02e13 2020-02-28 02:59  2020-02-27
5 265-1  ON     20200227 000000  2.02e13 2020-02-27 00:00  2020-02-26
6 265-1  ON     20200228 023536  2.02e13 2020-02-28 02:35  2020-02-27
7 265-1  ON     20200228 03000   2.02e13 2020-02-28 03:00  2020-02-28
8 265-2  ON     20200229 000000  2.02e13 2020-02-29 00:00  2020-02-28

Upvotes: 1

Related Questions