Bananas
Bananas

Reputation: 3

R Difference in time between rows

I've triangulated information from other SO answers for the below code, but getting stuck with an error message. Searched SO for similar errors and resolutions but haven't been able to figure it out, so help is appreciated.

For every group ("id"), I want to get the difference between the start times for consecutive rows.

Reproducible data:

require(dplyr)
df <-data.frame(id=as.numeric(c("1","1","1","2","2","2")), 
            start= c("1/31/17 10:00","1/31/17 10:02","1/31/17 10:45", 
                             "2/10/17 12:00", "2/10/17 12:20","2/11/17 09:40"))
time <- strptime(df$start, format = "%m/%d/%y %H:%M")
df %>%
group_by(id)%>%
mutate(diff = time - lag(time),
     diff_mins = as.numeric(diff, units = 'mins'))

Gets me error:

Error in mutate_impl(.data, dots) : Column diff must be length 3 (the group size) or one, not 6 In addition: Warning message: In unclass(time1) - unclass(time2) : longer object length is not a multiple of shorter object length

Upvotes: 0

Views: 1348

Answers (2)

Maurits Evers
Maurits Evers

Reputation: 50668

Do you mean something like this?

There is no need for lag here, a simple diff on the grouped times is sufficient.

df %>%
    mutate(start = as.POSIXct(start, format = "%m/%d/%y %H:%M")) %>%
    group_by(id) %>%
    mutate(diff = c(0, diff(start)))
## A tibble: 6 x 3
## Groups:   id [2]
#     id start                diff
#  <dbl> <dttm>              <dbl>
#1    1. 2017-01-31 10:00:00    0.
#2    1. 2017-01-31 10:02:00    2.
#3    1. 2017-01-31 10:45:00   43.
#4    2. 2017-02-10 12:00:00    0.
#5    2. 2017-02-10 12:20:00   20.
#6    2. 2017-02-11 09:40:00 1280.

Upvotes: 1

andrew_reece
andrew_reece

Reputation: 21264

You can use lag and difftime (per Hadley):

df %>%
  mutate(time = as.POSIXct(start, format = "%m/%d/%y %H:%M")) %>%
  group_by(id) %>%
  mutate(diff = difftime(time, lag(time)))

# A tibble: 6 x 4
# Groups:   id [2]
     id start         time                diff  
  <dbl> <fct>         <dttm>              <time>
1    1. 1/31/17 10:00 2017-01-31 10:00:00 <NA>  
2    1. 1/31/17 10:02 2017-01-31 10:02:00 2     
3    1. 1/31/17 10:45 2017-01-31 10:45:00 43    
4    2. 2/10/17 12:00 2017-02-10 12:00:00 <NA>  
5    2. 2/10/17 12:20 2017-02-10 12:20:00 20    
6    2. 2/11/17 09:40 2017-02-11 09:40:00 1280  

Upvotes: 0

Related Questions