How to filter groups based on one criterion while keeping the other rows for the same group?

Question

This data represents 3 people. I would like to keep only the people who had an error of 5 at least once, along with all of their rows.

id  error
1   0
1   0
1   5
2   0
2   5
2   0
3   0
3   0
3   0

structure(list(id = structure(c(1, 1, 1, 2, 2, 2, 3, 3, 3), format.stata = "%9.0g"), 
        error = structure(c(0, 0, 5, 0, 5, 0, 0, 0, 0), format.stata = "%9.0g")), row.names = c(NA, 
    -9L), class = c("tbl_df", "tbl", "data.frame"))

If I use this code:

df %>% 
  group_by(id) %>% 
  filter(error > 0)

As shown below, I get only the rows where an error occurred, but not the rows before and after it for the same person.

id  error
1   5
2   5

However, I would like to get all rows for those people, as I am following people over time and would like to see what happened before and afterwards.

Could someone point me in the right direction, please?

Karthik S · Accepted Answer

Does this work:

library(dplyr)
df %>% group_by(id) %>% filter(any(error == 5))
# A tibble: 6 x 2
# Groups:   id [2]
     id error
  <dbl> <dbl>
1     1     0
2     1     0
3     1     5
4     2     0
5     2     5
6     2     0
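
For anyone wondering why this keeps whole groups: inside a grouped `filter()`, `any(error == 5)` gives one TRUE/FALSE per `id`, and that single value is applied to every row of the group. Below is a minimal, self-contained sketch of the same idea. It assumes a plain data frame rebuilt from the question's data (the Stata format attributes are left out), and the names `kept`, `ids_with_error` and `kept_base` are only illustrative. A base R version is included for comparison.

```r
library(dplyr)

# Sample data from the question, rebuilt as a plain data frame
# (assumption: the Stata format attributes are not needed here)
df <- data.frame(
  id    = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
  error = c(0, 0, 5, 0, 5, 0, 0, 0, 0)
)

# any() yields a single TRUE/FALSE per group; filter() applies it to
# every row of that group, so each id is kept or dropped as a whole.
kept <- df %>%
  group_by(id) %>%
  filter(any(error == 5)) %>%
  ungroup()

# Base R equivalent: collect the ids that ever had an error of 5,
# then keep every row belonging to those ids.
ids_with_error <- unique(df$id[df$error == 5])
kept_base <- df[df$id %in% ids_with_error, ]
```

If the goal were the opposite, i.e. to drop those people, negating the condition with `filter(!any(error == 5))` should leave only the rows for id 3.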
