José Bustos

Reputation: 159

Counting consecutive values each three times in a data frame in R

I have this matrix:

df<-cbind(
t1=c(1,1,1),
t2=c(1,1,1),
t3=c(0,1,1),
t4=c(1,0,1),
t5=c(1,1,1),
t6=c(1,1,1),
t7=c(1,1,0),
t8=c(0,1,1),
t9=c(1,1,1))


> df
     t1 t2 t3 t4 t5 t6 t7 t8 t9
[1,]  1  1  0  1  1  1  1  0  1
[2,]  1  1  1  0  1  1  1  1  1
[3,]  1  1  1  1  1  1  0  1  1

and I need to count the consecutive "ones" in each row at t3, t6 and t9. Every time the counter reaches 3, it has to go back to zero and start over again.

In this case the results should be:

new_t3 = 0, 3, 3

new_t6 = 3, 2, 3

new_t9 = 1, 3, 2

How can I count these consecutive "ones" values at t3, t6 and t9? I've looked at rle but I'm still having trouble with it!

Many thanks for any help :)

Upvotes: 0

Views: 169

Answers (2)

Evan

Reputation: 2038

Something like this could work (edited to fix counts ending with 0):

dat <- as.data.frame(df)
new_t3 <- c()
for(i in seq_len(nrow(dat))){
    if(dat[i,3] != 0){
        # run-length encode columns t1 to t3 of row i
        count <- rle(dat[i,1:3])
        # keep the length of every run of 1s
        new_t3 <- append(new_t3, count$lengths[count$values == 1])
    } else{
        # the group ends in 0, so the count is 0
        new_t3 <- append(new_t3, 0)
    }
}

This loops over the rows and, for columns t1 to t3, uses the rle function to compute the lengths of the runs of consecutive equal values. count$lengths[count$values == 1] picks out, from the object returned by rle, the lengths of the runs whose value is 1. You'd have to repeat this for each group of columns you're counting, e.g.:

new_t6 <- c()
for(i in seq_len(nrow(dat))){
    if(dat[i,6] != 0){
        # same as above, for columns t4 to t6
        count <- rle(dat[i,4:6])
        new_t6 <- append(new_t6, count$lengths[count$values == 1])
    } else{
        new_t6 <- append(new_t6, 0)
    }
}

Or wrap the loop in a function or a nested for loop to automate it over the whole table. For new_t3 and new_t6 this returns the values in your example. Note that for new_t9 this method returns 1 1 3 2, because the first row (1 0 1) contains two separate runs of a single 1. If you need to avoid that kind of result, you might have to post-process the count variable (using unique or max, perhaps).

Changing df to a dataframe object allowed rle to work, otherwise it couldn't access the values.
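One way to wrap this in a function and sidestep the 1 0 1 case is to keep only the run that ends each group of three columns. The following is a minimal sketch, not a definitive implementation. The helper names trailing_ones and count_blocks are hypothetical, and it assumes the count you want is the length of the run of 1s that ends at t3, t6 and t9:

```r
# Same matrix as in the question
df <- cbind(
  t1 = c(1, 1, 1), t2 = c(1, 1, 1), t3 = c(0, 1, 1),
  t4 = c(1, 0, 1), t5 = c(1, 1, 1), t6 = c(1, 1, 1),
  t7 = c(1, 1, 0), t8 = c(0, 1, 1), t9 = c(1, 1, 1)
)

# Hypothetical helper: length of the run of 1s that ends the vector x,
# or 0 if x ends in 0
trailing_ones <- function(x) {
  r <- rle(as.vector(x))
  n <- length(r$lengths)
  if (r$values[n] == 1) r$lengths[n] else 0
}

# Hypothetical helper: apply trailing_ones to each group of `size` columns
count_blocks <- function(m, size = 3) {
  ends <- seq(size, ncol(m), by = size)   # last column of each group: 3, 6, 9
  out <- sapply(ends, function(e) {
    apply(m[, (e - size + 1):e, drop = FALSE], 1, trailing_ones)
  })
  out <- matrix(out, nrow = nrow(m))      # keep a matrix even for a single row
  colnames(out) <- paste0("new_", colnames(m)[ends])
  out
}

res <- count_blocks(df)
res
#      new_t3 new_t6 new_t9
# [1,]      0      3      1
# [2,]      3      2      3
# [3,]      3      3      2
```

Each column of res matches the values asked for in the question (new_t3 = 0, 3, 3 and so on), and the helper takes a plain numeric vector, so it is applied to the rows of the matrix directly.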

Upvotes: 1

digEmAll

Reputation: 57210

Here's a possible approach using a good old for-loop combined with apply:

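aggregateRow <- function(row){
  result <- rep(NA,length(row) %/% 3)
  cumul <- 0                    # running count of ones
  for(i in seq_along(row)){
    cumul <- cumul + row[i]
    if(i %% 3 == 0){            # at t3, t6, t9, ...
      if(row[i] == 0)           # a zero at this position gives a count of zero
        cumul <- 0
      if(cumul > 3)             # wrap around once the counter passes 3
        cumul <- cumul - 3
      result[i %/% 3] <- cumul
    }
  }
  return(result)
}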

res <- t(apply(df,1,aggregateRow))
colnames(res) <- paste0('new_t',c(3,6,9)) # one column per counted position, one row per row of df
> res
     new_t3 new_t6 new_t9
[1,]      0      3      2
[2,]      3      2      2
[3,]      3      3      2
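
The running total above carries over from one group of three columns to the next, so the first two rows of df end up with 2 at t9, where the question expects 1 and 3. Below is a minimal sketch of a variant, assuming the counter should go back to zero at every 0 and again after every third column. The name countPerBlock is hypothetical and not part of the original answer:

```r
# Same matrix as in the question
df <- cbind(
  t1 = c(1, 1, 1), t2 = c(1, 1, 1), t3 = c(0, 1, 1),
  t4 = c(1, 0, 1), t5 = c(1, 1, 1), t6 = c(1, 1, 1),
  t7 = c(1, 1, 0), t8 = c(0, 1, 1), t9 = c(1, 1, 1)
)

# Hypothetical variant of aggregateRow: count consecutive ones,
# restarting at every zero and after every third column
countPerBlock <- function(row){
  result <- rep(NA, length(row) %/% 3)
  cumul <- 0
  for(i in seq_along(row)){
    cumul <- if(row[i] == 0) 0 else cumul + 1   # a zero breaks the run
    if(i %% 3 == 0){
      result[i %/% 3] <- cumul                  # record the count at t3, t6, t9
      cumul <- 0                                # start over for the next group
    }
  }
  result
}

res2 <- t(apply(df, 1, countPerBlock))
colnames(res2) <- paste0("new_t", c(3, 6, 9))
res2
#      new_t3 new_t6 new_t9
# [1,]      0      3      1
# [2,]      3      2      3
# [3,]      3      3      2
```

Read column by column, this gives new_t3 = 0, 3, 3, new_t6 = 3, 2, 3 and new_t9 = 1, 3, 2, as in the question.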

Upvotes: 1
