Aite97

Reputation: 165

How to count observations between two rows based on condition in R?

I am trying to create a variable in a data frame that counts the number of observations between two observations that meet a criterion. In this case, it is the number of games played since the player's last win.

Say I have a dataframe like this:

df <- data.frame(player = c(10,10,10,10,10,10,10,10,10,10,10),
                 win = c(1,0,0,0,1,1,0,1,0,0,1))

I want to create a new variable that counts the number of games since the player last won. Written as a vector, the result should be (with NA for the first observation, which has no previous game):

c(NA,0,1,2,3,0,0,1,0,1,2)

I would like to do this easily and add the result as a variable in the data.frame, using dplyr or any other suitable approach.

Upvotes: 0

Views: 541

Answers (2)

Nicolás Velasquez

Reputation: 5898

With {tidyverse}, try:

library(tidyverse)

df <- data.frame(player = c(10,10,10,10,10,10,10,10,10,10,10),
                 win = c(1,0,0,0,1,1,0,1,0,0,1))

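df %>% 
  # start a new group whenever `win` changes, so each run of wins or losses is its own group
  group_by(player, group = cumsum(win != lag(win, default = first(win)))) %>%
  # number the rows within each run, then reset the winning rows to 0
  mutate(counter = row_number(),
         counter = if_else(win == 1, true = 0L, false = counter)) %>% 
  ungroup() %>% 
  # the first game of each player has no history, so mark it NA
  group_by(player) %>% 
  mutate(counter = if_else(row_number() == 1, true = NA_integer_, false = counter)) %>% 
  ungroup() %>% 
  select(-group)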

  player   win counter
    <dbl> <dbl>   <int>
 1     10     1      NA
 2     10     0       1
 3     10     0       2
 4     10     0       3
 5     10     1       0
 6     10     1       0
 7     10     0       1
 8     10     1       0
 9     10     0       1
10     10     0       2
11     10     1       0
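
Note that the counter above is 0 on the winning rows themselves, so it is shifted by one row compared with the vector given in the question. If the goal is to match that vector exactly, one possible variation (a sketch, not part of the code above) is to count rows since the most recent win and then lag the result by one game. The helper columns `run` and `since_win` are names made up for this illustration, and the sketch assumes each player's first game is a win, as in the sample data.

```r
library(dplyr)

df <- data.frame(player = c(10,10,10,10,10,10,10,10,10,10,10),
                 win = c(1,0,0,0,1,1,0,1,0,0,1))

out <- df %>%
  group_by(player) %>%
  # every win starts a new run: 1 1 1 1 2 3 3 4 4 4 5
  mutate(run = cumsum(win == 1)) %>%
  group_by(player, run) %>%
  # position within the run, starting at 0 on the win itself
  mutate(since_win = row_number() - 1L) %>%
  group_by(player) %>%
  # shift by one game so each row only looks at earlier games
  mutate(counter = lag(since_win)) %>%
  ungroup() %>%
  select(-run, -since_win)

out$counter
# NA 0 1 2 3 0 0 1 0 1 2
```

If a player starts with losses before any win, those rows fall into run 0 and are counted from the player's first game, so they may need separate handling.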

Upvotes: 1

Albin

Reputation: 902

I am not quite sure why the first value should be NA, because the first game is itself a win, so the elapsed time since the last "win" is 0 and not NA.

For purely logical reasons, I would take the following approach:

seq <- with(df, ave(win, cumsum(win == 1), FUN = seq_along) - 1)

This gives the number of games since the most recent win, with 0 on the winning rows themselves:

c(0,1,2,3,0,0,1,0,1,2,0)

If you still want the result you described, you can get it by shifting the vector one position to the right and putting NA in front:

# drop the last element, then prepend NA
append(NA, head(seq, -1))

It is not nice, but it works ;)
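
The sample data has only one player. With several players, the two steps above could be wrapped in a small helper and applied per player with ave(). This is only a sketch: `games_since_win`, `df2` and player 20 are made up for the illustration, and it assumes each player's first game is a win, as in the sample data.

```r
df <- data.frame(player = c(10,10,10,10,10,10,10,10,10,10,10),
                 win = c(1,0,0,0,1,1,0,1,0,0,1))

# hypothetical helper: games since the last win, looking only at earlier games
games_since_win <- function(win) {
  # cumsum(win == 1) labels each stretch that starts with a win;
  # seq_along() - 1 counts the position inside that stretch from 0
  since <- ave(win, cumsum(win == 1), FUN = seq_along) - 1
  # shift by one game and mark the first game as NA
  c(NA, head(since, -1))
}

# made-up second player with the same results, to show the grouping
df2 <- rbind(df, transform(df, player = 20))

# apply the helper separately to each player's games
df2$counter <- ave(df2$win, df2$player, FUN = games_since_win)
```

Because ave() splits by player first, the NA is placed on each player's first game and the shift never carries a value over from one player to the next.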

Upvotes: 1
