Almond

Reputation: 87

R: exclude all rows for a name if any of its rows contains a given value in a column

I have a data frame "test", shown below. Using R, I want to exclude all rows for a person if that person has "apple" anywhere in the "fruit" column. I wrote:

filter(test, name != test$name[test$fruit=="apple"])

[Images not preserved: the original "test" data frame, the actual result, and the expected result.]

Any help is appreciated! Thanks!

Upvotes: 0

Views: 45

Answers (2)

Shubham Pujan

Reputation: 220

> test
     name      fruit
1   kevin      apple
2   kevin       pear
3   kevin      peach
4    jack      apple
5    jack       pear
6    jack      peach
7    jack       kiwi
8   caleb grapefruit
9   caleb       kiwi
10  caleb       pear
11 justin  pineapple
12 justin      grape
13 justin watermelon
14 justin       kiwi

First, we find all the values of 'name' that have 'apple' as a fruit.

 df=unique(test$name[test$fruit=="apple"])

> df
[1] kevin jack 
Levels: caleb jack justin kevin

Now we need to remove the rows from test whose name matches one of those in df, i.e. 'kevin' or 'jack'.

test1= test[ (!(test$name %in% df)),]

> test1
     name      fruit
8   caleb grapefruit
9   caleb       kiwi
10  caleb       pear
11 justin  pineapple
12 justin      grape
13 justin watermelon
14 justin       kiwi

Of course, we can write this in a single line:

test2=test[(!(test$name %in% (unique(test$name[test$fruit=="apple"])))),]

> test2
     name      fruit
8   caleb grapefruit
9   caleb       kiwi
10  caleb       pear
11 justin  pineapple
12 justin      grape
13 justin watermelon
14 justin       kiwi
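As a side note on why the `!=` attempt in the question keeps some of the unwanted rows, here is a minimal sketch. It rebuilds the data from the printed output above, assuming character columns rather than factors; `apple_names`, `keep_recycled` and `keep_in` are illustrative names, not from the original post. `!=` compares element by element and recycles the shorter vector, whereas `%in%` tests membership in the whole set.

```r
# Rebuild the example data from the printed output above
# (assumption: plain character columns instead of factors).
test <- data.frame(
  name  = rep(c("kevin", "jack", "caleb", "justin"), times = c(3, 4, 3, 4)),
  fruit = c("apple", "pear", "peach",
            "apple", "pear", "peach", "kiwi",
            "grapefruit", "kiwi", "pear",
            "pineapple", "grape", "watermelon", "kiwi"),
  stringsAsFactors = FALSE
)

# Names that have "apple" in at least one row.
apple_names <- unique(test$name[test$fruit == "apple"])

# `!=` works element by element and recycles apple_names down the column,
# so row 2 ("kevin") is compared with "jack" and is wrongly kept.
keep_recycled <- test$name != apple_names

# `%in%` checks every name against the whole set of apple_names.
keep_in <- !(test$name %in% apple_names)

print(test[keep_recycled, ])  # still contains some kevin and jack rows
print(test[keep_in, ])        # contains only caleb and justin rows
```

Because the number of rows here is a multiple of the number of names, R recycles silently without a warning, which makes the mistake easy to miss.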

Upvotes: 1

Ronak Shah

Reputation: 388862

You can do this in multiple ways.

In base R:

subset(test, !ave(fruit == 'apple', name, FUN = any))

#   name     fruit
#4 Justin pineapple
#5 Justin     grape

Using dplyr:

test %>% group_by(name) %>% filter(!any(fruit == 'apple'))

Or data.table:

setDT(test)[, .SD[!any(fruit == 'apple')], name]

Another option in base R, without grouping, could be:

subset(test, !name %in% unique(name[fruit == "apple"]))

data

test <- data.frame(name  = c('Jack', 'Jack', 'Jack', 'Justin', 'Justin'),
                   fruit = c('pineapple', 'apple', 'grape', 'pineapple', 'grape'))
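
To check that the grouped and the non-grouped base R variants agree on this data, a small sketch like the following can be run. It uses only base R; `by_group` and `by_lookup` are illustrative names introduced here.

```r
# Same sample data as above.
test <- data.frame(name  = c("Jack", "Jack", "Jack", "Justin", "Justin"),
                   fruit = c("pineapple", "apple", "grape", "pineapple", "grape"))

# Grouped variant: ave() spreads any(fruit == "apple") across each name group.
by_group <- subset(test, !ave(fruit == "apple", name, FUN = any))

# Lookup variant: collect the names with "apple", then drop all their rows.
by_lookup <- subset(test, !name %in% unique(name[fruit == "apple"]))

print(by_group)
print(all.equal(by_group, by_lookup))
```

The lookup variant avoids grouping altogether, which may be easier to read when only one condition decides whether a whole group is dropped.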

Upvotes: 1
