Martijn

Reputation: 129

R: Conditional calculation based on values in other rows and columns

My data has the following format:

- first column: indicates whether the machine is running
- second column: total time that the machine is running

The dataset is shown below:

structure(c("", "running", "running", "running", "", "", "", 
"running", "running", "", "10", "15", "30", "2", "5", "17", "47", 
"12", "57", "87"), .Dim = c(10L, 2L), .Dimnames = list(NULL, 
    c("c", "v")))

I would like to add a third column that gives the cumulative time the machine has been running, i.e. the sum of all times since the machine last started running, and 0 on rows where it is not running. The desired output is shown below:

 [1,] ""        "10" "0"   
 [2,] "running" "15" "15"  
 [3,] "running" "30" "45"  
 [4,] "running" "2"  "47"  
 [5,] ""        "5"  "0"   
 [6,] ""        "17" "0"   
 [7,] ""        "47" "0"   
 [8,] "running" "12" "12"  
 [9,] "running" "57" "69"  
[10,] ""        "87" "0" 

I tried to write some R code to do this in an elegant way, but my programming skills are too limited at the moment. Does anybody know a solution to this problem? Thank you in advance!

Upvotes: 0

Views: 662

Answers (3)

akrun

Reputation: 887118

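We could use dplyr. This assumes DF is a data frame in which v is numeric:

library(dplyr)
DF %>%
  # each blank row starts a new block; also grouping by c separates
  # that blank row from the running rows that follow it
  group_by(cumsum(c == ''), c) %>%
  mutate(total = replace(cumsum(v), c == '', 0))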

Upvotes: 1

Zelazny7

Reputation: 40628

Here is a simple solution using base R:

DF$total <- ave(DF$v, DF$c, cumsum(DF$c == ""), FUN = cumsum)
DF$total[DF$c == ""] <- 0

> DF
         c  v total
1          10     0
2  running 15    15
3  running 30    45
4  running  2    47
5           5     0
6          17     0
7          47     0
8  running 12    12
9  running 57    69
10         87     0
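
To see why grouping on cumsum(DF$c == "") works, it may help to print the intermediate grouping key. The following is only a minimal sketch: it rebuilds the question's data by hand with a numeric v column, and the name block is introduced purely for illustration.

```r
# Rebuild the question's data as a data frame with a numeric v column.
DF <- data.frame(
  c = c("", "running", "running", "running", "", "", "",
        "running", "running", ""),
  v = c(10, 15, 30, 2, 5, 17, 47, 12, 57, 87),
  stringsAsFactors = FALSE
)

# Every blank row increments the counter, so each run of "running" rows
# shares one counter value with the blank row that precedes it.
block <- cumsum(DF$c == "")
print(block)

# ave() applies cumsum() within each combination of c and block;
# including c keeps the blank row out of the running rows' sum.
DF$total <- ave(DF$v, DF$c, block, FUN = cumsum)

# Blank rows are then reset to 0.
DF$total[DF$c == ""] <- 0
print(DF)
```

Consecutive blank rows each get their own counter value, so they never accumulate with one another before being set to 0.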

Upvotes: 2

Roland

Reputation: 132706

First we transform your data to a more appropriate data structure that can contain mixed data types:

m <- structure(c("", "running", "running", "running", "", "", "",
                 "running", "running", "", "10", "15", "30", "2", "5", "17", "47",
                 "12", "57", "87"),
               .Dim = c(10L, 2L),
               .Dimnames = list(NULL, c("c", "v")))
DF <- as.data.frame(m, stringsAsFactors = FALSE)
DF[] <- lapply(DF, type.convert, as.is = TRUE)

Then we can do this easily with package data.table:

library(data.table)
setDT(DF)
DF[, total := cumsum(v), by = rleid(c)]
DF[c == "", total := 0]
#          c  v total
# 1:         10     0
# 2: running 15    15
# 3: running 30    45
# 4: running  2    47
# 5:          5     0
# 6:         17     0
# 7:         47     0
# 8: running 12    12
# 9: running 57    69
#10:         87     0
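
For readers without data.table, the run identifier that rleid(c) provides can be approximated in base R with rle(). This is a rough sketch rather than the answer's own method; the names c_col, v, runs and run_id are made up for the example, and the data are retyped from the question.

```r
# The two columns from the question, as plain vectors.
c_col <- c("", "running", "running", "running", "", "", "",
           "running", "running", "")
v <- c(10, 15, 30, 2, 5, 17, 47, 12, 57, 87)

# rle() finds runs of identical consecutive values; repeating the run
# index by each run's length gives one id per row.
runs <- rle(c_col)
run_id <- rep(seq_along(runs$lengths), runs$lengths)
print(run_id)

# Cumulative sum within each run, then zero out the non-running rows.
total <- ave(v, run_id, FUN = cumsum)
total[c_col == ""] <- 0
print(total)
```

Unlike the cumsum(c == "") key, a run id changes whenever the value of c changes, so no second grouping variable is needed.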

Upvotes: 2
