How do I calculate the duration for multiple continuous occurrence of an event?

Question

This is a subset of my data

    # | Event | Time
    1    A       22:00:00
    2    B       22:00:10
    3    B       22:00:20
    4    B       22:00:30
    5    C       22:00:40
    6    B       22:00:10
    7    B       22:00:20
    8    B       22:00:30
    9    A       22:00:40

I want to calculate the duration of a continuous occurrence of a same event – so, say I want the duration of event B, the output should be

    # | Event | Time      | Duration
    1    A       22:00:00   NA
    2    B       22:00:10   20 Secs
    3    B       22:00:20   NA
    4    B       22:00:30   NA
    5    C       22:00:40   NA
    6    B       22:00:10   20 Secs
    7    B       22:00:20   NA
    8    B       22:00:30   NA
    9    A       22:00:40   NA

I've tried (variations of) solutions from: Using conditional statements with difftime in R and Calulcate running difference of time using difftime on one column of timestamps

But I can't seem to get what I'm trying to achieve. Hope I can get some help! Thanks!

akrun · Accepted Answer

We can use data.table. Convert the 'data.frame' to 'data.table' (setDT(df1)), create a grouping variable based on the run-length-id of 'Event' i.e. based on same adjacent elements ('grp'), convert the 'Time' to POSIXct , grouped by 'grp', specifying the 'i' with the logical condition (Event == "B"), find the difference of the last and first 'Time1' using difftime, then we assign the values of the 'Duration' that is not the first observation for each 'grp' to NA.

library(data.table)
setDT(df1)[, grp := rleid(Event)][, Time1 := as.POSIXct(Time, format = 
       "%H:%M:%S")][Event == "B", 
     Duration := as.numeric(difftime(Time1[.N], Time1[1], unit = "secs")), grp]
df1[df1[, .I[seq_len(.N) != 1], grp]$V1, Duration := NA][, c("Time1", "grp") := NULL][]
#   No Event     Time Duration
#1:  1     A 22:00:00       NA
#2:  2     B 22:00:10       20
#3:  3     B 22:00:20       NA
#4:  4     B 22:00:30       NA
#5:  5     C 22:00:40       NA
#6:  6     B 22:00:10       20
#7:  7     B 22:00:20       NA
#8:  8     B 22:00:30       NA
#9:  9     A 22:00:40       NA

data

df1 <- structure(list(No = 1:9, Event = c("A", "B", "B", "B", "C", "B", 
 "B", "B", "A"), Time = c("22:00:00", "22:00:10", "22:00:20", 
 "22:00:30", "22:00:40", "22:00:10", "22:00:20", "22:00:30", "22:00:40"
 )), .Names = c("No", "Event", "Time"), class = "data.frame", row.names = c(NA, 
-9L))

How do I calculate the duration for multiple continuous occurrence of an event?

Answers (1)

data

Related Questions