catabolic
catabolic

Reputation: 37

Mutate case_when date time data in R

I am working with a nocturnal species and am trying to label GPS fixes recorded overnights by date and time periods. The GPS fixes were recorded between 19:00:00 one night and 05:00:00 the following morning for ~ 80 days. Each night period span two days (obviously). I want to organises the fixes by night as opposed to date so the animal movement isn't split in the middle of the active period. I am having issues getting this to work correctly. I would like to create a new column with an ID for a collection of days in a period (e.g. 1-week) but would also like to create a column for each separate night.

When I first read in the data it looks like the following (noting this is not my data but an example of 4 columns that exist within the dataset and its structure):

library(tidyverse)
library(magrittr)
library(lubridate)

data <- read_csv("data/gps-fixes.csv")
Fix_date_time Animal_ID Latitude Longitude
15/01/21 19:00 1 -20.67891 132.87652
15/01/21 20:00 1 -20.66652 132.88912
15/01/21 21:00 1 -20.68752 132.92312
15/01/21 22:00 1 -20.64652 132.88912
15/01/21 23:00 1 -20.70652 132.88812
16/01/21 00:00 1 -20.78652 132.88712
16/01/21 01:00 1 -20.78652 132.88712
16/01/21 02:00 1 -20.78652 132.88712
16/01/21 03:00 1 -20.78652 132.88712
16/01/21 04:00 1 -20.78652 132.88712
16/01/21 05:00 1 -20.78652 132.88712

Fix_date_time is read in as a col_character() and the other three variables are col_double() The date when first read in appears as DMY H:M

I then convert the date and time to date/time date using lubridate

data$Fix_date_time <- dmy_hm(data$Fix_date_time)

Which returns the following

Fix_date_time Animal_ID Latitude Longitude
2021-01-15 19:00:00 1 -20.67891 132.87652
2021-01-15 20:00:00 1 -20.66652 132.88912
2021-01-15 21:00:00 1 -20.68752 132.92312
2021-01-15 22:00:00 1 -20.64652 132.88912
2021-01-15 23:00:00 1 -20.70652 132.88812
2021-01-16 00:00:00 1 -20.78652 132.88712
2021-01-16 01:00:00 1 -20.78652 132.88712
2021-01-16 02:00:00 1 -20.78652 132.88712
2021-01-16 03:00:00 1 -20.78652 132.88712
2021-01-16 04:00:00 1 -20.78652 132.88712
2021-01-16 05:00:00 1 -20.78652 132.88712

The date time then displayed as a dttm vector.

If I then attempt to create a new column from a number of nights I get the incorrect output. If I enter the following code:

data %<>% 
mutate(period = case_when(
       (Fix_date_time >= 2021-01-15 19:00:00) & (Fix_date_time <= 2021-01-21 05:00:00) ~ "Period-1",
       (Fix_date_time >= 2021-01-21 19:00:00) & (Fix_date_time <= 2021-01-28 05:00:00) ~ "Period-2",
       (Fix_date_time >= 2021-01-29 19:00:00) & (Fix_date_time <= 2021-02-02 05:00:00) ~ "Period-3"))

I instead end up with the last GPS fixes in each period ending the on the night before the end date at around 8pm (e.g. for "Period-1" the last fixes are around 22:00 on 2021-01-21). However if make it <= 19:00:00 for the end date in each period it works. While this is a work around for now, I want to figure out what I am doing wrong to fix it in the future.

Many thanks :)

Upvotes: 2

Views: 7143

Answers (2)

Joon
Joon

Reputation: 91

I like to use the timetk library with its between_time function for cases like this. It is a tad bit more elegant than lubridate.

library(dplyr)
library(timetk)

data %>% 
  mutate(
    period = case_when(
      Fix_date_time %>% between_time('2021-01-15 19:00:00','2021-01-21 05:00:00') ~ 'Period-1',
      Fix_date_time %>% between_time('2021-01-21 19:00:00','2021-01-28 05:00:00') ~ 'Period-2',
      Fix_date_time %>% between_time('2021-01-29 19:00:00','2021-02-02 05:00:00') ~ 'Period-3',
  )
)

Upvotes: 1

Ronak Shah
Ronak Shah

Reputation: 389345

You need to compare the data with the same class. Try the following :

library(dplyr)
library(magrittr)
library(lubridate)

data %<>% 
  mutate(period = case_when(
    Fix_date_time >= ymd_hms('2021-01-15 19:00:00') & 
    Fix_date_time <= ymd_hms('2021-01-21 05:00:00') ~ "Period-1",
    Fix_date_time >= ymd_hms('2021-01-21 19:00:00') & 
    Fix_date_time <= ymd_hms('2021-01-28 05:00:00') ~ "Period-2",
    Fix_date_time >= ymd_hms('2021-01-29 19:00:00') & 
    Fix_date_time <= ymd_hms('2021-02-02 05:00:00') ~ "Period-3"))

Upvotes: 3

Related Questions