R_exclude rows with a column containing a value if multiple rows exist

Question

I have a dataframe "test" as below. I want to exclude all the rows of that person, if this person has "apple" in the "fruit" column, using R language. I wrote:

filter(test, name != test$name[test$fruit=="apple"])

original "test" data frame

actual result

expected result

Any help is appreciated! Thanks!

Shubham Pujan · Accepted Answer

> test
     name      fruit
1   kevin      apple
2   kevin       pear
3   kevin      peach
4    jack      apple
5    jack       pear
6    jack      peach
7    jack       kiwi
8   caleb grapefruit
9   caleb       kiwi
10  caleb       pear
11 justin  pineapple
12 justin      grape
13 justin watermelon
14 justin       kiwi

First, we find all the names that have 'apple' as a fruit.

df = unique(test$name[test$fruit == "apple"])

> df
[1] kevin jack 
Levels: caleb jack justin kevin

Now we need to remove the rows from test whose name matches one of those in df, i.e. 'kevin' or 'jack'.

test1 = test[!(test$name %in% df), ]

> test1
     name      fruit
8   caleb grapefruit
9   caleb       kiwi
10  caleb       pear
11 justin  pineapple
12 justin      grape
13 justin watermelon
14 justin       kiwi
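
As a side note on why `%in%` is used here rather than the `!=` from the question: when the right-hand side has more than one element, `!=` compares element by element and recycles the shorter vector, so it is probably not doing a membership test. The sketch below rebuilds the data from the printout above (as character columns, which is an assumption; the original appears to use factors) and compares the two masks. The names `bad_names`, `keep_recycled`, and `keep_member` are made up for illustration.

```r
# Rebuild the example data from the printout above.
test <- data.frame(
  name  = rep(c("kevin", "jack", "caleb", "justin"), times = c(3, 4, 3, 4)),
  fruit = c("apple", "pear", "peach",
            "apple", "pear", "peach", "kiwi",
            "grapefruit", "kiwi", "pear",
            "pineapple", "grape", "watermelon", "kiwi"),
  stringsAsFactors = FALSE
)

# Names that have at least one "apple" row: "kevin" and "jack".
bad_names <- unique(test$name[test$fruit == "apple"])

# `!=` pairs row 1 with "kevin", row 2 with "jack", row 3 with "kevin", ...
# so some kevin/jack rows survive purely because of their position.
keep_recycled <- test$name != bad_names

# `%in%` checks every name against the whole set of bad names.
keep_member <- !(test$name %in% bad_names)

test[keep_recycled, ]  # still contains some kevin and jack rows
test[keep_member, ]    # only caleb and justin rows
```

Because 14 rows is a multiple of 2, R does not even warn about the recycling in this case, which makes the mistake easy to miss.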

Of course, we can write this in a single line:

test2 = test[!(test$name %in% unique(test$name[test$fruit == "apple"])), ]

> test2
     name      fruit
8   caleb grapefruit
9   caleb       kiwi
10  caleb       pear
11 justin  pineapple
12 justin      grape
13 justin watermelon
14 justin       kiwi
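
Since the question used `filter()`, the same idea can also be written with dplyr. This is only a sketch, assuming the dplyr package is installed and that the data are stored as character columns; `res_in` and `res_grouped` are illustrative names, not part of the original answer.

```r
library(dplyr)

# Same example data as in the printout above.
test <- data.frame(
  name  = rep(c("kevin", "jack", "caleb", "justin"), times = c(3, 4, 3, 4)),
  fruit = c("apple", "pear", "peach",
            "apple", "pear", "peach", "kiwi",
            "grapefruit", "kiwi", "pear",
            "pineapple", "grape", "watermelon", "kiwi"),
  stringsAsFactors = FALSE
)

# Variant 1: the same %in% test, written inside filter().
res_in <- filter(test, !(name %in% name[fruit == "apple"]))

# Variant 2: decide per person, dropping any group that contains an apple row.
res_grouped <- test %>%
  group_by(name) %>%
  filter(!any(fruit == "apple")) %>%
  ungroup()

res_in
res_grouped
```

Both variants keep only the caleb and justin rows. The grouped form reads closer to the wording of the question ("exclude all the rows of that person"), while the first is a direct translation of the base R answer.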
