b222

Reputation: 976

Filter based on levels of column

help <- data.frame(id = c(5, 5, 7, 7, 18, 18, 42, 42, 46, 46, 50, 51),
                   grade = c("a", "a", "b", "b", "c", "c", "d", "d", "e", "e", "w", "z"),
                   pass = c("yes", "no", "yes", "no", "no", "no", "yes", "no", "yes", "yes", "yes", "no"))

Using the help dataset, I want to:

I am hoping to end up with a dataset that looks like this:

  id grade pass
   5     a  yes
   7     b  yes
  42     d  yes
  46     e  yes
  46     e  yes

I attempted to use...

help %>% group_by(id, grade, pass) %>% filter(pass == "yes" & pass == "no")

but even that doesn't work as it erases everything and outputs an empty df.
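A minimal sketch of why that filter comes back empty, using a made-up three-element vector rather than the real data: the condition is evaluated row by row, and a single value can never equal both "yes" and "no" at once.

```r
# Illustrative vector only; not the real `help` data.
pass <- c("yes", "no", "yes")

# Each element is compared on its own, so no value can equal both strings.
both <- pass == "yes" & pass == "no"
both      # every element is FALSE, so filter() would keep no rows

# Testing for either value needs %in% (or |) instead of &.
either <- pass %in% c("yes", "no")
either    # every element is TRUE
```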

Upvotes: 1

Views: 296

Answers (4)

MKR

Reputation: 20095

Using base R, a solution could be:

help <- data.frame(id = c(5, 5, 7, 7, 18, 18, 42, 42, 46, 46, 50, 51),
    grade = c("a", "a", "b", "b", "c", "c", "d", "d", "e", "e", "w", "z"),
    pass = c("yes", "no", "yes", "no", "no", "no", "yes", "no", "yes", "yes", "yes", "no"))

# Keep rows whose id and grade occur more than once. The trick is to flag
# duplicates scanning from the start and then again from the end.
help2 <- help[duplicated(help[, 1:2]) | duplicated(help[, 1:2], fromLast = TRUE), ]

# Filter for the pass
help2[help2$pass == "yes", ]

#   id grade pass
#1   5     a  yes
#3   7     b  yes
#7  42     d  yes
#9  46     e  yes
#10 46     e  yes
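To see why both duplicated() calls are needed, here is a small sketch on a made-up three-row data frame (the name `keys` and its values are illustrative only). Scanning from the start flags only the later copies, scanning from the end flags only the earlier ones, and OR-ing the two flags every row that belongs to a repeated group.

```r
# Toy key columns (made up for illustration): one repeated pair, one singleton.
keys <- data.frame(id = c(5, 5, 7), grade = c("a", "a", "b"))

from_start <- duplicated(keys)                   # FALSE  TRUE FALSE
from_end   <- duplicated(keys, fromLast = TRUE)  #  TRUE FALSE FALSE

# A row is in a repeated group if either scan flags it.
in_repeated_group <- from_start | from_end       #  TRUE  TRUE FALSE
keys[in_repeated_group, ]
```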

Upvotes: 1

Onyambu

Reputation: 79318

subset(help, !duplicated(help) & pass == "yes")
   id grade pass
1   5     a  yes
3   7     b  yes
7  42     d  yes
9  46     e  yes
11 50     w  yes

Upvotes: 1

www

Reputation: 39164

We can group by id and grade with group_by(), then use filter() to keep only the groups that have more than one row, and within those only the rows where pass is "yes".

library(dplyr)

help %>%
  group_by(id, grade) %>%
  filter(n() > 1, pass %in% "yes") %>%
  ungroup()
# # A tibble: 5 x 3
#      id grade pass 
#   <dbl> <fct> <fct>
# 1  5.00 a     yes  
# 2  7.00 b     yes  
# 3 42.0  d     yes  
# 4 46.0  e     yes  
# 5 46.0  e     yes 
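For anyone who wants the same group-size logic without dplyr, a rough base R sketch follows. It assumes ave() with FUN = length as a stand-in for n(); the names `scores`, `group_key`, `group_size` and `result` are made up here, and `scores` is just a copy of the question's data under a name that does not mask help().

```r
# Copy of the question's data under a hypothetical name.
scores <- data.frame(
  id    = c(5, 5, 7, 7, 18, 18, 42, 42, 46, 46, 50, 51),
  grade = c("a", "a", "b", "b", "c", "c", "d", "d", "e", "e", "w", "z"),
  pass  = c("yes", "no", "yes", "no", "no", "no", "yes", "no", "yes", "yes", "yes", "no")
)

# Size of each (id, grade) group, repeated for every row of that group.
group_key  <- interaction(scores$id, scores$grade, drop = TRUE)
group_size <- ave(seq_len(nrow(scores)), group_key, FUN = length)

# Keep rows from groups with more than one row where pass is "yes".
result <- scores[group_size > 1 & scores$pass == "yes", ]
result
```

Unlike the tibble above, this keeps the original row names of the data frame.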

Upvotes: 1

ifly6

Reputation: 5331

So I load it in:

og_help <- data.frame(id = c(5, 5, 7, 7, 18, 18, 42, 42, 46, 46, 50, 51),
                   grade = c("a", "a", "b", "b", "c", "c", "d", "d", "e", "e", "w", "z"),
                   pass = c("yes", "no", "yes", "no", "no", "no", "yes", "no", "yes", "yes", "yes", "no"))

Then I return a unique set of the rows:

help <- unique(og_help)

Then I keep only the rows whose pass variable is set to "yes":

help <- help[ which(help$pass == "yes"), ]

This outputs the following:

   id grade pass
1   5     a  yes
3   7     b  yes
7  42     d  yes
9  46     e  yes
11 50     w  yes

Upvotes: 0
