Reputation: 1
I am brand new to R, and am having trouble figuring out how to set up a simple time series. Illustration: say I have three variables: Event (0 or 1), HR (heart rate), DT (datetime):
df = data.frame(Event = c(1,0,0,0,1,0,0),
HR= c(100,120,115,105,105,115,100),
DT= c("2020-01-01 09:00:00","2020-01-01 09:15:00","2020-01-01 10:00:00","2020-01-01 10:30:00",
"2020-01-01 11:00:00","2020-01-01 12:00:00","2020-01-01 13:00:00"),
stringsAsFactors = F
)
Event HR DT
1 1 100 2020-01-01 09:00:00
2 0 120 2020-01-01 09:15:00
3 0 115 2020-01-01 10:00:00
4 0 105 2020-01-01 10:30:00
5 1 105 2020-01-01 11:00:00
6 0 115 2020-01-01 12:00:00
7 0 100 2020-01-01 13:00:00
What I would like to do is calculate the elapsed time since the most recent event, so row 1 = 0 min, row 2 = 15, row 3 = 60, ... row 5 = 0, row 6 = 60. Then I can do things like plot HR vs. elapsed time.
What might be a simple way to calculate elapsed time? Apologies for such a low-level question, but I would be very grateful for any help!
Upvotes: 0
Views: 64
Reputation: 1044
Welcome to Stack Overflow @greyguy.
Here is an approach with the dplyr library, which is pretty good with large data sets:
library(dplyr)
# Your data
df = data.frame(Event = c(1,0,0,0,1,0,0),
HR= c(100,120,115,105,105,115,100),
DT= c("2020-01-01 09:00:00","2020-01-01 09:15:00","2020-01-01 10:00:00","2020-01-01 10:30:00",
"2020-01-01 11:00:00","2020-01-01 12:00:00","2020-01-01 13:00:00"),
stringsAsFactors = F
)
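# Convert DT from string to date-time, and sort by time in case it is not already ordered
df = df %>%
  mutate(DT = as.POSIXct(DT, format = "%Y-%m-%d %H:%M:%S")) %>%
  arrange(DT) %>%
  mutate(
    # Little trick to get the DT of the most recent event:
    # keep DT on event rows (NA elsewhere), then carry it forward
    last_DT = case_when(Event == 1 ~ DT),
    # na.locf() comes from the zoo package; na.rm = FALSE keeps rows before the first event
    last_DT = zoo::na.locf(last_DT, na.rm = FALSE),
    # Ask for minutes explicitly instead of relying on automatic difftime units
    Elapsed_min = as.numeric(difftime(DT, last_DT, units = "mins"))
  ) %>%
  select(-last_DT)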
The output:
# Event HR DT Elapsed_min
# 1 100 2020-01-01 09:00:00 0
# 0 120 2020-01-01 09:15:00 15
# 0 115 2020-01-01 10:00:00 60
# 0 105 2020-01-01 10:30:00 90
# 1 105 2020-01-01 11:00:00 0
# 0 115 2020-01-01 12:00:00 60
# 0 100 2020-01-01 13:00:00 120
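In case the carry-forward step is unclear, here is a rough base R sketch of the same idea without zoo or dplyr. It is only an illustration, not the code above: the helper name last_event_row is made up here, the timestamps are parsed as UTC for reproducibility, and the rows are assumed to be sorted by time already.

```r
# Example data from the question, with DT parsed as date-time (UTC is an assumption)
df <- data.frame(
  Event = c(1, 0, 0, 0, 1, 0, 0),
  HR = c(100, 120, 115, 105, 105, 115, 100),
  DT = as.POSIXct(c("2020-01-01 09:00:00", "2020-01-01 09:15:00",
                    "2020-01-01 10:00:00", "2020-01-01 10:30:00",
                    "2020-01-01 11:00:00", "2020-01-01 12:00:00",
                    "2020-01-01 13:00:00"), tz = "UTC")
)

# Row index on event rows, 0 elsewhere; cummax() carries the latest index forward
# (last_event_row is a hypothetical helper name)
last_event_row <- cummax(ifelse(df$Event == 1, seq_len(nrow(df)), 0L))
# Rows before the first event have no reference time
last_event_row[last_event_row == 0] <- NA

# Timestamp of the most recent event for every row, then the gap in minutes
last_DT <- df$DT[last_event_row]
df$Elapsed_min <- as.numeric(difftime(df$DT, last_DT, units = "mins"))
```

Rows that come before the first event end up as NA rather than raising an error, which may or may not be what you want.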
Upvotes: 0
Reputation: 4358
The following uses the chron library and converts your date/time column to chron date-time objects, so that the library can run calculations and conversions on it.
Example Data:
df <- data.frame(
Event=c(1,0,0,0,1,0,0),
HR=c(100,125,115,105,105,115,100),
DT=c("2020-01-01 09:00:00"
,"2020-01-01 09:15:00"
,"2020-01-01 10:00:00"
,"2020-01-01 10:30:00"
,"2020-01-01 11:00:00"
,"2020-01-01 12:00:00"
,"2020-01-01 13:00:00"))
Code:
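library(chron)
# Split each "date time" string into its date part and its time part
parts <- strsplit(as.character(df$DT), " ")
Dates <- sapply(parts, head, n = 1)
Times <- sapply(parts, tail, n = 1)
# Build chron objects; differences between them are measured in days
df$DT <- chron(Dates, Times, format = c(dates = "y-m-d", times = "h:m:s"))
df$TimeElapsed <- 0
# Assumes the first row is an event, so TimeStart is set before it is used
for (i in seq_len(nrow(df))) {
  if (df$Event[i] == 1) TimeStart <- df$DT[i]
  # Convert the difference from days to minutes
  df$TimeElapsed[i] <- as.numeric(df$DT[i] - TimeStart) * 24 * 60
}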
Output:
> df
Event HR DT TimeElapsed
1 1 100 (20-01-01 09:00:00) 0
2 0 125 (20-01-01 09:15:00) 15
3 0 115 (20-01-01 10:00:00) 60
4 0 105 (20-01-01 10:30:00) 90
5 1 105 (20-01-01 11:00:00) 0
6 0 115 (20-01-01 12:00:00) 60
7 0 100 (20-01-01 13:00:00) 120
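The *24*60 factor in the loop is there because the difference between two chron objects is expressed in days. A quick way to check that arithmetic, sketched here with base R's difftime() instead of chron (an assumption made only so the snippet needs no extra package; the variable names are illustrative):

```r
# Two timestamps 15 minutes apart (UTC chosen for reproducibility)
start <- as.POSIXct("2020-01-01 09:00:00", tz = "UTC")
later <- as.POSIXct("2020-01-01 09:15:00", tz = "UTC")

# The gap as a fraction of a day
days <- as.numeric(difftime(later, start, units = "days"))

# days * 24 hours/day * 60 minutes/hour gives minutes
mins <- days * 24 * 60
```

Because the day fraction is a floating-point number, compare the result with all.equal() rather than == if you need to test it.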
Upvotes: 0
Reputation: 24878
Here is a one-line approach using data.table.
Data:
df <- structure(list(Event = c(1, 0, 0, 0, 1, 0, 0), HR = c(100, 120,
115, 105, 105, 115, 100), DT = structure(c(1577869200, 1577870100,
1577872800, 1577874600, 1577876400, 1577880000, 1577883600), class = c("POSIXct",
"POSIXt"), tzone = "UTC")), row.names = c(NA, -7L), class = "data.frame")
Code:
library(data.table)
dt <- as.data.table(df)
dt[, mins_since_last_event := as.numeric(difftime(DT,DT[1],units = "mins")), by = .(cumsum(Event))]
Output:
dt
Event HR DT mins_since_last_event
1: 1 100 2020-01-01 09:00:00 0
2: 0 120 2020-01-01 09:15:00 15
3: 0 115 2020-01-01 10:00:00 60
4: 0 105 2020-01-01 10:30:00 90
5: 1 105 2020-01-01 11:00:00 0
6: 0 115 2020-01-01 12:00:00 60
7: 0 100 2020-01-01 13:00:00 120
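The key idea is that cumsum(Event) goes up by one at every event, so it works as a group label for each stretch of rows that belongs to one event. If you would rather not install a package, the same grouping can probably be done in base R with ave(). This is a sketch under the same assumptions as above (rows sorted by time, DT already POSIXct); the names episode and secs are made up for illustration.

```r
# Example data from the question, with DT parsed as date-time (UTC is an assumption)
df <- data.frame(
  Event = c(1, 0, 0, 0, 1, 0, 0),
  HR = c(100, 120, 115, 105, 105, 115, 100),
  DT = as.POSIXct(c("2020-01-01 09:00:00", "2020-01-01 09:15:00",
                    "2020-01-01 10:00:00", "2020-01-01 10:30:00",
                    "2020-01-01 11:00:00", "2020-01-01 12:00:00",
                    "2020-01-01 13:00:00"), tz = "UTC")
)

# cumsum() increases by one at every event, labelling each episode
episode <- cumsum(df$Event)  # 1 1 1 1 2 2 2

# Within each episode, subtract the first timestamp (POSIXct is stored in seconds)
secs <- ave(as.numeric(df$DT), episode, FUN = function(x) x - x[1])
df$mins_since_last_event <- secs / 60
```

From there, plot(df$mins_since_last_event, df$HR) should give the HR vs. elapsed plot the question asks about.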
Upvotes: 1