Manuel Frias

Reputation: 311

Create counter of consecutive runs of a certain value

I have data where consecutive runs of zero are separated by runs of non-zero values. I want to create a counter for the runs of zero in the column 'SOG'.

For the first run of zeros in SOG, the counter in column Stops should be 1; for the second run of zeros, 2; and so on. Rows where SOG is non-zero get 0.

SOG Stops
--- -----
4   0
4   0
0   1
0   1
0   1
3   0
4   0
5   0
0   2
0   2
1   0
2   0
0   3
0   3
0   3

Upvotes: 4

Views: 796

Answers (4)

Ronak Shah

Reputation: 389315

A one-liner with rle:

df <- data.frame(SOG = c(4,4,0,0,0,3,4,5,0,0,1,2,0,0,0))
df <- transform(df, Stops = with(rle(SOG == 0), rep(cumsum(values) * values, lengths)))
df

#   SOG Stops
#1    4     0
#2    4     0
#3    0     1
#4    0     1
#5    0     1
#6    3     0
#7    4     0
#8    5     0
#9    0     2
#10   0     2
#11   1     0
#12   2     0
#13   0     3
#14   0     3
#15   0     3
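
To see why the one-liner works, it can help to look at the intermediate pieces. This is only an illustrative breakdown on the same data; the names r, run_id and stops are introduced here and are not part of the answer above.

```r
# Same data as in the answer above.
SOG <- c(4, 4, 0, 0, 0, 3, 4, 5, 0, 0, 1, 2, 0, 0, 0)

r <- rle(SOG == 0)   # runs of TRUE (zero) and FALSE (non-zero)
r$lengths
#[1] 2 3 3 2 2 3
r$values
#[1] FALSE  TRUE FALSE  TRUE FALSE  TRUE

# cumsum() numbers the TRUE runs; multiplying by values zeroes the FALSE runs.
run_id <- cumsum(r$values) * r$values
run_id
#[1] 0 1 0 2 0 3

# rep() expands one id per run back to one id per row.
stops <- rep(run_id, r$lengths)
```

Here stops holds the same values as the Stops column printed above.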

Upvotes: 1

Pat W.

Reputation: 1831

Using dplyr:

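 library(dplyr)
 # Shortened 14-element version of the data, matching the SOG printed below
 df <- data.frame(SOG = c(4,0,0,0,3,4,5,0,0,1,2,0,0,0))
 df <- df %>% mutate(Stops = ifelse(SOG == 0, yes = cumsum(c(0, diff(!SOG) > 0)), no = 0))
 df$Stops
 #[1] 0 1 1 1 0 0 0 2 2 0 0 3 3 3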

EDIT: As an aside for those of us who are still beginners: many of the answers to this question rely on logicals (TRUE and FALSE). Applying ! to a numeric vector like SOG first coerces it to logical (0 becomes FALSE, anything else TRUE) and then negates it, so !SOG is TRUE where SOG is 0 and FALSE otherwise.

SOG
#[1] 4 0 0 0 3 4 5 0 0 1 2 0 0 0
!SOG
#[1] FALSE  TRUE  TRUE  TRUE FALSE FALSE FALSE  TRUE  TRUE FALSE FALSE
#[12]  TRUE  TRUE  TRUE

diff() takes the difference between each value and the one before it. Note that the result has one fewer element than SOG, since the first element has no predecessor to subtract. With logicals, TRUE counts as 1 and FALSE as 0, so diff(!SOG) gives 1 where a run of zeros starts (TRUE - FALSE), -1 where one ends (FALSE - TRUE), and 0 otherwise.

diff(SOG)
#[1] -4  0  0  3  1  1 -5  0  1  1 -2  0  0
diff(!SOG)
#[1]  1  0  0 -1  0  0  1  0 -1  0  1  0  0

So cumsum(diff(!SOG) > 0) counts only the FALSE-to-TRUE changes, i.e. the starts of zero runs:

cumsum(diff(!SOG) > 0)
#[1] 1 1 1 1 1 1 2 2 2 2 3 3 3

But since the vector of differences is one element shorter than SOG, we prepend a 0 so the lengths match:

cumsum(c(0, diff(!SOG) > 0))  #Or cumsum( c(0, diff(!SOG)) > 0 ) 
#[1] 0 1 1 1 1 1 1 2 2 2 2 3 3 3

Then either multiply that vector by !SOG as in @akrun's answer, or use ifelse(): where an element of SOG is 0, take the corresponding element of cumsum(c(0, diff(!SOG) > 0)); otherwise assign 0.
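
The steps above can be collected into one small helper. This is a minimal sketch, not part of the original answer: the name count_zero_runs is hypothetical, and it assumes a non-empty numeric vector without NAs. It also departs from the code above in one place: instead of prepending a fixed 0, it prepends whether the first element is itself zero, because with a fixed 0 a vector that starts with a zero would get its first run labelled 0.

```r
# Hypothetical helper (not from the original answers): numbers each run of zeros.
# Assumes x is a non-empty numeric vector with no NA values.
count_zero_runs <- function(x) {
  is_zero <- x == 0
  # A run starts where is_zero flips from FALSE to TRUE; the first element
  # starts a run if it is itself zero.
  run_start <- c(is_zero[1], diff(is_zero) > 0)
  # Running count of run starts, kept only on the zero elements.
  ifelse(is_zero, cumsum(run_start), 0)
}

SOG <- c(4, 0, 0, 0, 3, 4, 5, 0, 0, 1, 2, 0, 0, 0)
count_zero_runs(SOG)
#[1] 0 1 1 1 0 0 0 2 2 0 0 3 3 3

# A vector that starts with zeros still gets its first run numbered 1.
count_zero_runs(c(0, 0, 3, 0))
#[1] 1 1 0 2
```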

Upvotes: 2

akrun

Reputation: 887911

Try

 df$stops <- with(df, cumsum(c(0, diff(!SOG)) > 0) * !SOG)
 df$stops
 # [1] 0 0 1 1 1 0 0 0 2 2 0 0 3 3 3

Upvotes: 3

Roland

Reputation: 132969

SOG <- c(4,4,0,0,0,3,4,5,0,0,1,2,0,0,0)
#run length encoding:
tmp <- rle(SOG)
#turn values into logicals
tmp$values <- tmp$values == 0
#number the TRUE (zero) runs 1, 2, 3, ...; FALSE runs become 0
tmp$values[tmp$values] <- cumsum(tmp$values[tmp$values])
#invert the run-length encoding
inverse.rle(tmp)
#[1] 0 0 1 1 1 0 0 0 2 2 0 0 3 3 3
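
To get from this vector to the Stops column the question asks for, the same rle()/inverse.rle() idea can be wrapped in a function and assigned to the data frame. A rough sketch under a few assumptions: zero_run_id is a hypothetical name, the input has no NAs, and it encodes SOG == 0 directly rather than converting the values afterwards.

```r
# Hypothetical wrapper around the rle()/inverse.rle() steps above.
zero_run_id <- function(x) {
  runs <- rle(x == 0)                               # runs of zero / non-zero
  runs$values <- cumsum(runs$values) * runs$values  # 1, 2, 3, ... for zero runs, else 0
  inverse.rle(runs)                                 # back to one value per element
}

df <- data.frame(SOG = c(4, 4, 0, 0, 0, 3, 4, 5, 0, 0, 1, 2, 0, 0, 0))
df$Stops <- zero_run_id(df$SOG)
df$Stops
#[1] 0 0 1 1 1 0 0 0 2 2 0 0 3 3 3

# rle() treats a leading run of zeros like any other run.
zero_run_id(c(0, 0, 3, 0))
#[1] 1 1 0 2
```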

Upvotes: 7
