greyguy

Reputation: 1

Beginner: set up time series in R

I am brand new to R, and am having trouble figuring out how to set up a simple time series. Illustration: say I have three variables: Event (0 or 1), HR (heart rate), DT (datetime):

df = data.frame(Event = c(1,0,0,0,1,0,0),
                HR= c(100,120,115,105,105,115,100),
                DT= c("2020-01-01 09:00:00","2020-01-01 09:15:00","2020-01-01 10:00:00","2020-01-01 10:30:00",
                      "2020-01-01 11:00:00","2020-01-01 12:00:00","2020-01-01 13:00:00"),
                stringsAsFactors = F
)
  Event  HR                  DT
1     1 100 2020-01-01 09:00:00
2     0 120 2020-01-01 09:15:00
3     0 115 2020-01-01 10:00:00
4     0 105 2020-01-01 10:30:00
5     1 105 2020-01-01 11:00:00
6     0 115 2020-01-01 12:00:00
7     0 100 2020-01-01 13:00:00

What I would like to do is calculate the elapsed time since the most recent event. So row1 = 0 min, row2 = 15, row3 = 60, ..., row5 = 0, row6 = 60. Then I can do things like plot HR vs. elapsed time.

What might be a simple way to calculate elapsed time? Apologies for such a low level question, but would be very grateful for any help!

Upvotes: 0

Views: 64

Answers (3)

vpz

Reputation: 1044

Welcome to Stack Overflow, @greyguy. Here is an approach using the dplyr package (with na.locf() from the zoo package), which works well on large data sets:

library(dplyr)

# Your data

df = data.frame(Event = c(1,0,0,0,1,0,0),
                HR= c(100,120,115,105,105,115,100),
                DT= c("2020-01-01 09:00:00","2020-01-01 09:15:00","2020-01-01 10:00:00","2020-01-01 10:30:00",
                      "2020-01-01 11:00:00","2020-01-01 12:00:00","2020-01-01 13:00:00"),
                stringsAsFactors = F
)

# Convert DT from string to date-time, then sort by time in case the rows are not already ordered

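df = df %>% 
      mutate(DT = as.POSIXct(DT, format = "%Y-%m-%d %H:%M:%S")) %>% 
      arrange(DT) %>%
      mutate(# Little trick: keep DT only on event rows, then carry it forward
             last_DT = case_when(Event==1 ~ DT),
             # na.locf() comes from the zoo package; na.rm = FALSE keeps leading NAs
             last_DT = zoo::na.locf(last_DT, na.rm = FALSE),
             # Ask for minutes explicitly instead of relying on the automatic unit
             Elapsed_min = as.numeric(difftime(DT, last_DT, units = "mins"))
             ) %>%
      select(-last_DT)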

The output:

# Event    HR                    DT   Elapsed_min
#     1   100   2020-01-01 09:00:00             0
#     0   120   2020-01-01 09:15:00            15
#     0   115   2020-01-01 10:00:00            60
#     0   105   2020-01-01 10:30:00            90
#     1   105   2020-01-01 11:00:00             0
#     0   115   2020-01-01 12:00:00            60
#     0   100   2020-01-01 13:00:00           120
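
As a variation on the same idea, here is a minimal sketch that skips the carry-forward step and instead groups on a running count of events. The `episode` and `res` names are my own placeholders, and `tz = "UTC"` is an assumption, since the question does not state a time zone. It asks difftime() for minutes explicitly rather than relying on whichever unit R picks automatically.

```r
library(dplyr)

df <- data.frame(
  Event = c(1, 0, 0, 0, 1, 0, 0),
  HR    = c(100, 120, 115, 105, 105, 115, 100),
  DT    = c("2020-01-01 09:00:00", "2020-01-01 09:15:00", "2020-01-01 10:00:00",
            "2020-01-01 10:30:00", "2020-01-01 11:00:00", "2020-01-01 12:00:00",
            "2020-01-01 13:00:00"),
  stringsAsFactors = FALSE
)

res <- df %>%
  # tz = "UTC" is an assumption; use the zone the data was recorded in
  mutate(DT = as.POSIXct(DT, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")) %>%
  arrange(DT) %>%
  # cumsum() increases by one at every event row, so each event opens a new group
  group_by(episode = cumsum(Event == 1)) %>%
  # Within a group, the first timestamp is the event itself
  mutate(Elapsed_min = as.numeric(difftime(DT, first(DT), units = "mins"))) %>%
  ungroup() %>%
  select(-episode)
```

Note that any rows before the first event would end up in group 0 and be measured from the first row, so you may want to filter those out.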

Upvotes: 0

Daniel O

Reputation: 4358

The following uses the chron package and converts your date/time column to chron objects so that the package can run calculations and conversions on them.

Example Data:

df <- data.frame(
  Event=c(1,0,0,0,1,0,0),
  HR=c(100,125,115,105,105,115,100),
  DT=c("2020-01-01 09:00:00"
      ,"2020-01-01 09:15:00"
      ,"2020-01-01 10:00:00"
      ,"2020-01-01 10:30:00"
      ,"2020-01-01 11:00:00"
      ,"2020-01-01 12:00:00"
      ,"2020-01-01 13:00:00"))

Code:

library(chron)

Dates <- lapply(strsplit(as.character(df$DT)," "),head,n=1)
Times <- lapply(strsplit(as.character(df$DT)," "),tail,n=1)

df$DT <- chron(as.character(Dates),as.character(Times),format=c(dates="y-m-d",times="h:m:s"))

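# Preallocate the result column
df$TimeElapsed <- 0

# Assumes the first row is an event; otherwise TimeStart is undefined at i = 1
for(i in seq_len(nrow(df))){
  # Reset the reference time at each new event
  if(df$Event[i]==1){TimeStart <- df$DT[i]}
  # chron differences are in days, so convert to minutes
  df$TimeElapsed[i] <- (df$DT[i]-TimeStart)*24*60
}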

Output:

> df
  Event  HR                  DT TimeElapsed
1     1 100 (20-01-01 09:00:00)           0
2     0 125 (20-01-01 09:15:00)          15
3     0 115 (20-01-01 10:00:00)          60
4     0 105 (20-01-01 10:30:00)          90
5     1 105 (20-01-01 11:00:00)           0
6     0 115 (20-01-01 12:00:00)          60
7     0 100 (20-01-01 13:00:00)         120

Upvotes: 0

Ian Campbell

Reputation: 24878

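Here is a one-line approach using data.table. Grouping by cumsum(Event) starts a new group at each event row, so DT[1] within a group is the time of the most recent event.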

Data:

df <- structure(list(Event = c(1, 0, 0, 0, 1, 0, 0), HR = c(100, 120, 
115, 105, 105, 115, 100), DT = structure(c(1577869200, 1577870100, 
1577872800, 1577874600, 1577876400, 1577880000, 1577883600), class = c("POSIXct", 
"POSIXt"), tzone = "UTC")), row.names = c(NA, -7L), class = "data.frame")

Code:

library(data.table)
dt <- as.data.table(df)
dt[, mins_since_last_event := as.numeric(difftime(DT,DT[1],units = "mins")), by = .(cumsum(Event))]

Output:

dt
   Event  HR                  DT mins_since_last_event
1:     1 100 2020-01-01 09:00:00                     0
2:     0 120 2020-01-01 09:15:00                    15
3:     0 115 2020-01-01 10:00:00                    60
4:     0 105 2020-01-01 10:30:00                    90
5:     1 105 2020-01-01 11:00:00                     0
6:     0 115 2020-01-01 12:00:00                    60
7:     0 100 2020-01-01 13:00:00                   120
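
For comparison, the same grouping idea can be sketched in base R with no extra packages. This is only an illustration: the `episode` and `secs` names are my own, and `tz = "UTC"` matches the data above but may not match yours.

```r
df <- data.frame(
  Event = c(1, 0, 0, 0, 1, 0, 0),
  HR    = c(100, 120, 115, 105, 105, 115, 100),
  DT    = as.POSIXct(c("2020-01-01 09:00:00", "2020-01-01 09:15:00",
                       "2020-01-01 10:00:00", "2020-01-01 10:30:00",
                       "2020-01-01 11:00:00", "2020-01-01 12:00:00",
                       "2020-01-01 13:00:00"), tz = "UTC")
)

# Running count of events: increases by one at every row where Event == 1
episode <- cumsum(df$Event == 1)

# POSIXct values are stored as seconds since the epoch
secs <- as.numeric(df$DT)

# Within each episode, subtract the first timestamp, then convert seconds to minutes
df$mins_since_last_event <- ave(secs, episode, FUN = function(x) x - x[1]) / 60
```

From there, plot(df$mins_since_last_event, df$HR) gives the HR vs. elapsed time plot mentioned in the question.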

Upvotes: 1
