Jack

Reputation: 857

How to filter groups on one criterion while keeping all other rows for the same group?

This data represents 3 people. I would like to keep only the people who had an error of 5 at least once, including all of their rows.

id  error
1   0
1   0
1   5
2   0
2   5
2   0
3   0
3   0
3   0

structure(list(id = structure(c(1, 1, 1, 2, 2, 2, 3, 3, 3), format.stata = "%9.0g"), 
        error = structure(c(0, 0, 5, 0, 5, 0, 0, 0, 0), format.stata = "%9.0g")), row.names = c(NA, 
    -9L), class = c("tbl_df", "tbl", "data.frame"))

If I use this code:

df %>% 
  group_by(id) %>% 
  filter(error > 0)

As shown below, I get only the rows where an error occurred, but not the rows before and after it for the same person.

id  error
1   5
2   5

However, I would like to get the output below, because I am following people over time and want to see what happened before and after the error:

   id   error
    1   0
    1   0
    1   5
    2   0
    2   5
    2   0
 

Could someone guide me a bit please?

Upvotes: 1

Views: 39

Answers (2)

akrun

Reputation: 887691

We could also use %in% inside a grouped filter, which keeps every row of any id whose error values include 5:

library(dplyr)
df %>%
    group_by(id) %>% 
    filter(5 %in% error)

-output

# A tibble: 6 x 2
# Groups:   id [2]
#     id error
#  <dbl> <dbl>
#1     1     0
#2     1     0
#3     1     5
#4     2     0
#5     2     5
#6     2     0

Upvotes: 1

Karthik S

Reputation: 11596

Does this work?

library(dplyr)
df %>% group_by(id) %>% filter(any(error == 5))
# A tibble: 6 x 2
# Groups:   id [2]
#      id error
#   <dbl> <dbl>
# 1     1     0
# 2     1     0
# 3     1     5
# 4     2     0
# 5     2     5
# 6     2     0
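
Here is a rough sketch of why the grouped filter keeps whole groups. Inside group_by(), any(error == 5) returns a single TRUE or FALSE per id, and filter() applies that one value to every row of the group. The names flags, ids_with_error and result are made up for illustration, and the data is rebuilt with data.frame() without the Stata format attributes:

```r
library(dplyr)

# Sample data from the question (Stata format attributes omitted)
df <- data.frame(
  id    = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
  error = c(0, 0, 5, 0, 5, 0, 0, 0, 0)
)

# One TRUE/FALSE per id: did this person ever have an error of 5?
flags <- df %>%
  group_by(id) %>%
  summarise(has_error_5 = any(error == 5))

# Ungrouped alternative: collect the ids that ever had an error of 5,
# then keep every row belonging to one of those ids
ids_with_error <- unique(df$id[df$error == 5])
result <- df %>% filter(id %in% ids_with_error)
```

The ungrouped version should give the same rows as the grouped filter here, without leaving the result grouped by id.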

Upvotes: 1
