matt_k

Reputation: 4489

Filter data frame rows based on values in vector

What is the best way to filter rows out of a data frame when the values to be removed are stored in a vector? In my case I have a column of dates and want to remove several of them.

I know how to delete rows corresponding to one day, using !=, e.g.:

m[m$date != "01/31/11", ]

To remove several dates, specified in a vector, I tried:

m[m$date != c("01/31/11", "01/30/11"), ]

However, this generates a warning message:

Warning message:
In `!=.default`(m$date, c("01/31/11", "01/30/11")) :
longer object length is not a multiple of shorter object length
Calls: [ ... [.data.frame -> Ops.dates -> NextMethod -> Ops.times -> NextMethod

What is the correct way to apply a filter based on multiple values?

Upvotes: 18

Views: 13794

Answers (4)

Ben G

Reputation: 4348

For those who asked about a tidyverse-compatible solution: anti_join from dplyr achieves the same effect. It keeps only the rows of the first table that have no match in the second:

library(tidyverse)

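# Both tibbles need a column with the same name for the join to match on
numbers <- tibble(numbers = 1:10)
numbers_to_remove <- tibble(numbers = c(3L, 4L, 5L))

# Keep the rows of `numbers` that have no match in `numbers_to_remove`
numbers %>%
  anti_join(numbers_to_remove, by = "numbers")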

Upvotes: 2

Vova Naumov

Reputation: 51

A neat way is to use the Negate function to create a new operator:

`%ni%` <- Negate(`%in%`) 

Then you can use it to pick out the elements that are not in a given set.
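As a rough usage sketch (the vectors `x` and `to_drop` are made up for illustration, not taken from the question):

```r
# Negate() wraps %in% so that the result is logically inverted
`%ni%` <- Negate(`%in%`)

# Hypothetical example data
x <- c("a", "b", "c", "d")
to_drop <- c("b", "d")

# Keep only the elements of x that are NOT in to_drop
kept <- x[x %ni% to_drop]
print(kept)
```

For a data frame the same operator would go in the row index, e.g. `m[m$date %ni% dates_to_drop, ]`, assuming the column and the vector hold comparable values.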

Upvotes: 4

nzcoops

Reputation: 9380

I think for that you want:

m[!m$date %in% c("01/31/11","01/30/11"),]
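A small sketch of why `!=` misbehaves here while `%in%` does not. The data frame below is invented, and the dates are stored as plain character strings for simplicity (an assumption; the column in the question appears to be a chron dates object). `!=` compares element by element and recycles the shorter vector, whereas `%in%` tests each value for membership in the whole set:

```r
# Hypothetical data; dates kept as character strings for illustration
m <- data.frame(
  date  = c("01/30/11", "01/31/11", "02/01/11", "02/02/11"),
  value = c(10, 20, 30, 40),
  stringsAsFactors = FALSE
)
drop_dates <- c("01/31/11", "01/30/11")

# `!=` recycles drop_dates along m$date, so row 1 is compared only with
# "01/31/11", row 2 only with "01/30/11", and so on. With 4 rows and
# 2 values the lengths divide evenly, so there is not even a warning.
recycled <- m$date != drop_dates

# `%in%` checks every date against the full set of values to drop
keep <- !(m$date %in% drop_dates)

filtered <- m[keep, ]
print(recycled)
print(filtered)
```

In this sketch the `!=` version removes nothing, because neither unwanted date happens to line up with its recycled counterpart, while the `%in%` version drops both rows.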

Upvotes: 14

Chase

Reputation: 69251

nzcoops is spot on with his suggestion. I posed this question in the R Chat a while back and Paul Teetor suggested defining a new function:

`%notin%` <- function(x,y) !(x %in% y) 

Which can then be used as follows:

foo <- letters[1:6]

> foo[foo %notin% c("a", "c", "e")]
[1] "b" "d" "f"

Needless to say, this little gem is now in my R profile and gets used quite often.

Upvotes: 40
