R count number of rows between two rows with specific variables

Question

My goal is to count the number of rows between rows with specific values. My starting point is this:

  id=c(1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2),
  var1=c('A', 'B', 'E', 'N', 'J', 'B', 'J', 'A', 'C', 'B', 'L', 'M', 'B', 'C', 'F'),
  var2=c(1, 1, 1, 1, 1, 2, 2, 1, 1, 1, 1, 1, 2, 2, 1 )
)

The goal is to count the number of rows grouped by id, between row with var1='B' and var2='1' and row with var1='B' and var2='2'. So for id=1 there are 3 such rows, and for id=2, there are 2 such rows. My target is this:

  id=c(1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2),
  var1=c('A', 'B', 'E', 'N', 'J', 'B', 'J', 'A', 'C', 'B', 'L', 'M', 'B', 'C', 'F'),
  var2=c(1, 1, 1, 1, 1, 2, 2, 1, 1, 1, 1, 1, 2, 2, 1 ),
  num=c(NA, NA, 1, 2, 3, NA, NA, NA, NA, NA, 1, 2, NA, NA, NA)            
 )

What I have tried before: Count number of rows between two specific rows I'm still not sure what task the person here is trying to accomplish. I also tried this: Counting number of rows between rows This bit is counting not between 2 rows, but from top row on. I need to count between specified rows. I understand that this task can be accomplished in a variety of ways. Such as, I can just select only the rows that I need to count and discard anything else. Or I can count within the dataframe that I have. But I'm not sure how to do either of these.

AndS. · Accepted Answer

You could try something like this:

library(tidyverse)

my_df <- tibble(
  id=c(1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2),
  var1=c('A', 'B', 'E', 'N', 'J', 'B', 'J', 'A', 'C', 'B', 'L', 'M', 'B', 'C', 'F'),
  var2=c(1, 1, 1, 1, 1, 2, 2, 1, 1, 1, 1, 1, 2, 2, 1 )
)

my_df %>%
  mutate(rw_nm = row_number()) %>%
  filter(var1 == "B" & var2 %in% c(1,2))%>%
  group_by(id) %>%
  summarise(count = (rw_nm[[2]] - rw_nm[[1]])-1)
#> # A tibble: 2 x 2
#>      id count
#>    
#> 1     1     3
#> 2     2     2

R count number of rows between two rows with specific variables

Answers (1)

Related Questions