select only rows with duplicate id and specific value from another column in R

Question

I have the following data with ID and value:

id <- c("1103-5","1103-5","1104-2","1104-2","1104-4","1104-4","1106-2","1106-2","1106-3","1106-3","2294-1","2294-1","2294-2","2294-2","2294-2","2294-3","2294-3","2294-3","2294-4","2294-4","2294-5","2294-5","2294-5","2300-1","2300-1","2300-2","2300-2","2300-4","2300-4","2321-1","2321-1","2321-2","2321-2","2321-3","2321-3","2321-4","2321-4","2347-1","2347-1","2347-2","2347-2")

value <- c(6,3,6,3,6,3,6,3,6,3,3,6,9,3,6,9,3,6,3,6,9,3,6,9,6,9,6,9,6,9,3,9,3,9,3,9,3,9,6,9,6)

As you can see, there are multiple values for the same id. What I'd like to do is get the values that are 3 and 6, but only when both occur for the same ID. For example, ID "1103-5" has both 3 and 6, so it should be in the list, but "2347-2" should not.

I'm using R

One method I tried is the following, but it gives me everything with value 3 and 6.

d <- data.frame(id, value)
group36 <- d[d$value == 3 | d$value == 6,]

and

d %>% group_by(id) %>% filter(3 == value | 6 == value)

The output should be like this:

iod · Accepted Answer

library(dplyr)
d <- group_by(d, id)
filter(d, any(value == 3), any(value == 6))

This gives you all the IDs where there is both a value of 3 (somewhere) AND a value of 6 (somewhere). Mind you, your data contains some IDs with THREE values. In these cases, if both 3 and 6 are present, all rows for that ID are included in the result, including the row with the third value.
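To make the group-level behaviour concrete, here is a minimal sketch on a small made-up data frame (`toy` and `res` are hypothetical names, not part of the question). It assumes dplyr is installed. Inside a grouped `filter()`, `any(value == 3)` is evaluated once per `id` and yields a single TRUE or FALSE for the whole group, so each ID is kept or dropped as a unit.

```r
library(dplyr)

# Hypothetical toy data, not from the question:
# "a" has 3 and 6, "b" has 6 and 9, "c" has 3, 6 and 9
toy <- data.frame(
  id    = c("a", "a", "b", "b", "c", "c", "c"),
  value = c(6, 3, 9, 6, 9, 3, 6)
)

res <- toy %>%
  group_by(id) %>%
  # both conditions must hold somewhere within the same id
  filter(any(value == 3), any(value == 6)) %>%
  ungroup()

print(res)
```

In this sketch "b" is dropped because it never has a 3, while "c" is kept with all three of its rows, including the 9.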

If you want to exclude the remaining rows whose value does not equal 3 or 6, filter that result again:

filter(filter(d, any(value == 3), any(value == 6)), value == 3 | value == 6)

If you want to exclude IDs that have 3 and 6 as values but also have OTHER values, use this:

filter(d, any(value == 3), any(value == 6), all(value %in% c(3, 6)))
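Dropping individual rows and dropping whole IDs are easy to confuse, so here is a sketch that runs both on the question's data next to the plain "has both" filter. It assumes dplyr is installed. The result names `has_both`, `rows_36` and `ids_36_only` are made up for illustration, and `all(value %in% c(3, 6))` is just one way to express "this ID has no other values".

```r
library(dplyr)

# Data copied from the question
id <- c("1103-5","1103-5","1104-2","1104-2","1104-4","1104-4","1106-2","1106-2",
        "1106-3","1106-3","2294-1","2294-1","2294-2","2294-2","2294-2","2294-3",
        "2294-3","2294-3","2294-4","2294-4","2294-5","2294-5","2294-5","2300-1",
        "2300-1","2300-2","2300-2","2300-4","2300-4","2321-1","2321-1","2321-2",
        "2321-2","2321-3","2321-3","2321-4","2321-4","2347-1","2347-1","2347-2",
        "2347-2")
value <- c(6,3,6,3,6,3,6,3,6,3,3,6,9,3,6,9,3,6,3,6,9,3,6,9,6,9,6,9,6,9,3,9,3,
           9,3,9,3,9,6,9,6)

d <- group_by(data.frame(id, value), id)

# IDs that contain both 3 and 6; every row of those IDs is kept
has_both <- filter(d, any(value == 3), any(value == 6))

# Same IDs, but rows with any other value (here 9) are dropped
rows_36 <- filter(d, any(value == 3), any(value == 6), value %in% c(3, 6))

# Only IDs whose values are exclusively 3 and 6
ids_36_only <- filter(d, any(value == 3), any(value == 6),
                      all(value %in% c(3, 6)))

nrow(has_both)     # 23
nrow(rows_36)      # 20
nrow(ids_36_only)  # 14
```

The three IDs that also carry a 9 ("2294-2", "2294-3", "2294-5") appear in full in `has_both`, appear without their 9 rows in `rows_36`, and are absent from `ids_36_only`.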
