FMM
FMM

Reputation: 2035

Calculate total elapsed time

I have the following data:

library(dplyr, warn.conflicts = FALSE)

df <- tibble(
  x = c(30, 60, 90, 30, 60, 90),
  phase = c(rep(c("phase 1", "phase 2"), each = 3))
)

df
#> # A tibble: 6 x 2
#>       x phase  
#>   <dbl> <chr>  
#> 1    30 phase 1
#> 2    60 phase 1
#> 3    90 phase 1
#> 4    30 phase 2
#> 5    60 phase 2
#> 6    90 phase 2

Created on 2020-08-11 by the reprex package (v0.3.0)

Where x is the elapsed time (in seconds) within each phase. Since phase is something that happens continuously, I am interested in calculating the total elapsed time.

Desired output:

#> # A tibble: 6 x 3
#>       x phase   elapsed_time
#>   <dbl> <chr>          <dbl>
#> 1    30 phase 1           30
#> 2    60 phase 1           60
#> 3    90 phase 1           90
#> 4    30 phase 2          120
#> 5    60 phase 2          150
#> 6    90 phase 2          180

Any ideas? Please, note that my real example has much more phases.

Upvotes: 0

Views: 155

Answers (3)

Richard Telford
Richard Telford

Reputation: 9933

Here is another dplyr solution. It finds the start of each phase, and adds this to x

library(tidyverse)
df <- tibble(
  x = c(30, 60, 90, 30, 60, 90),
  phase = c(rep(c("phase 1", "phase 2"), each = 3))
)

df %>% group_by(phase) %>% 
  nest() %>% 
  mutate(start = map_dbl(data, max)) %>% 
  ungroup() %>% 
  mutate(start = lag(start, default = 0)) %>% 
  unnest(data) %>% 
  mutate(elapsed_time = start + x) %>% 
  select(-start)
#> # A tibble: 6 x 3
#>   phase       x elapsed_time
#>   <chr>   <dbl>        <dbl>
#> 1 phase 1    30           30
#> 2 phase 1    60           60
#> 3 phase 1    90           90
#> 4 phase 2    30          120
#> 5 phase 2    60          150
#> 6 phase 2    90          180

Created on 2020-08-11 by the reprex package (v0.3.0)

Upvotes: 1

wagscode
wagscode

Reputation: 41

I believe the following post has the answer you're looking for:

How to add a cumulative column to an R dataframe using dplyr?

It shows how to create a cumulative column using group_by and mutate. It appears you want the elapsed time to sum across both phases, so simply do not include the group_by call in your code.

Upvotes: 4

Sotos
Sotos

Reputation: 51622

Here is an idea via dplyr. First we need to group by the phase and get the time differences for each. We then ungroup() and take the cumsum() as a total, i.e.

library(dplyr)

df %>% 
 group_by(phase) %>% 
 mutate(diffs = x - lag(x, default = 0)) %>% 
 ungroup() %>% 
 mutate(res = cumsum(diffs)) %>% 
 select(-diffs)

# A tibble: 6 x 3
#      x phase     res
#  <dbl> <chr>   <dbl>
#1    30 phase 1    30
#2    60 phase 1    60
#3    90 phase 1    90
#4    30 phase 2   120
#5    60 phase 2   150
#6    90 phase 2   180

Upvotes: 3

Related Questions