Fionn

Reputation: 207

Create time bins and assign data to correct bin

I would like to create a sequence of 30 minute time bins over a 24 hour period which I have done using

seq(as.POSIXct("2018-03-25"), as.POSIXct("2018-03-26"), by = "30 min")

I have a set of data with specific times such as 25/03/2018 05:08 and 25/03/2018 18:39. I would like to create a data frame that lists the time bins and marks each one 'present' or 'absent' depending on whether a data point falls within that bin.

I thought that I could do this using interval with lubridate, but I haven't been able to create the sequence of bins. I had hoped to use %within% to match the data points to the bins but I am relatively new to R and am not able to do this.

My data look like the following, with detections of sharks at different locations (Station in the dataset). My actual data have 41894 observations spanning a three-month period, and I need to match these to the correct time bin for each day over that period.

detect_date        Station  
25/03/2018 00:09    SS01   
25/03/2018 01:17    SS03 
25/03/2016 14:37    SS04 
25/03/2016 23:43    SS04

The output I would like in the end would be something like the following.

bin                Location  
25/03/2018 00:00    SS01 
25/03/2018 00:30   Absent 
25/03/2018 01:00    SS03

Would really appreciate any help!

Upvotes: 2

Views: 1329

Answers (1)

PavoDive

Reputation: 6496

I tried to solve this using data.table and lubridate and sticking to my idea of using floor_date.

# load packages
library(data.table)
library(lubridate)

# define a vector evenly spaced each 30 minutes:
b <- data.table(dates = seq(as.POSIXct("2018-03-25", tz = "UTC"), 
                            as.POSIXct("2018-03-26", tz = "UTC"), 
                            by = "30 min"))

# reproduce data
dt <- data.table(detect_date = as.character(c("25/03/2018 00:09", "25/03/2018 01:17", "25/03/2016 14:37", "25/03/2016 23:43")), 
                 Station = c("SS01", "SS03", "SS04", "SS04"), 
                 Individual = c("A", "B", "C", "B"))

# convert detect_date from character to date-time (POSIXct, UTC)
dt[, detect_date := dmy_hm(detect_date)]

# make a join
dt[, .(Location = Station, Individual), by = .(dates = floor_date(detect_date, "30 minutes"))][b, on = "dates"]
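The join above leaves NA in Location for bins with no detection, and its bins cover a single day only. A minimal sketch of one way to get the 'Absent' label and to derive the bins from the data range is below. It assumes, for illustration only, that all four sample detections are from 2018 so that each one falls inside the bin range; the names bins and res are hypothetical.

```r
library(data.table)
library(lubridate)

# Sample detections; all dated 2018 here (an assumption for illustration)
dt <- data.table(
  detect_date = dmy_hm(c("25/03/2018 00:09", "25/03/2018 01:17",
                         "25/03/2018 14:37", "25/03/2018 23:43")),
  Station = c("SS01", "SS03", "SS04", "SS04")
)

# Floor each detection to the start of its 30-minute bin
dt[, dates := floor_date(detect_date, "30 minutes")]

# Bins from midnight of the first day to 23:30 of the last day in the data
bins <- data.table(
  dates = seq(floor_date(min(dt$detect_date), "day"),
              floor_date(max(dt$detect_date), "day") + 23.5 * 3600,
              by = "30 min")
)

# Keep every bin; bins without a matching detection get NA for Station
res <- dt[bins, on = "dates", .(bin = dates, Location = Station)]

# Label the empty bins
res[is.na(Location), Location := "Absent"]

print(head(res, 3))
```

With three months of data the same code should produce one row per 30-minute bin across the whole period. A bin that contains several detections appears once per detection, so an extra unique() or aggregation step may be needed if only one row per bin is wanted.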

Upvotes: 2
