kccu

Reputation: 283

Filtering multiple columns of data frame inside a loop in R

I want to use a loop to filter multiple columns of a data frame, removing rows where any of the given column values are in a particular list.

For instance:

> my_df <- data.frame(word1 = c("one", "two", "red", "blue"), word2 = c("apple","orange","banana","pear"), word3 = c("red", "orange", "yellow", "green"))
> color_words = c("red", "orange", "yellow", "green", "blue")
> my_df
  word1  word2  word3
1   one  apple    red
2   two orange orange
3   red banana yellow
4  blue   pear  green

Using the dplyr filter() function:

> my_df %>% filter(!word1 %in% color_words) %>% filter(!word2 %in% color_words)
  word1 word2 word3
1   one apple   red

My first attempt to perform this filtering in a loop was:

col_names <- c("word1","word2")
for(col in col_names){
    my_df <- my_df %>% filter(!col %in% color_words)
}
> my_df
  word1  word2  word3
1   one  apple    red
2   two orange orange
3   red banana yellow
4  blue   pear  green

I read about quoting and unquoting when using filter(), so I also tried:

for(col in col_names){
    col <- enquo(col)
    my_df <- my_df %>% filter(!UQ(col) %in% color_words)
}
> my_df
  word1  word2  word3
1   one  apple    red
2   two orange orange
3   red banana yellow
4  blue   pear  green

and

for(col in col_names){
    my_df <- my_df %>% filter(!UQ(col) %in% color_words)
}
> my_df
  word1  word2  word3
1   one  apple    red
2   two orange orange
3   red banana yellow
4  blue   pear  green

What is the correct way to go about doing this filtering via a loop?

Upvotes: 0

Views: 1589

Answers (2)

Yuriy Saraykin

Reputation: 8880

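You can do this in base R. Note that grepl() does regular-expression matching, so this also drops rows where a color word appears only as a substring of a value, unlike the exact match done by %in%: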

my_df <- data.frame(word1 = c("one", "two", "red", "blue"), word2 = c("apple","orange","banana","pear"), word3 = c("red", "orange", "yellow", "green"))
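# Collapse the color words into one regex alternation: "red|orange|yellow|green|blue"
color_words <- paste0(c("red", "orange", "yellow", "green", "blue"), collapse = "|")
# For each row, look only at the first two columns and keep the row if neither value matches
fltr <- apply(my_df[1:2], 1, function(x) !any(grepl(color_words, x)))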
my_df[fltr, ]
#>   word1 word2 word3
#> 1   one apple   red

Created on 2020-09-25 by the reprex package (v0.3.0)
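If exact matching is wanted in base R, so that it behaves like the %in% test in the question rather than a regex search, a minimal sketch could look like the following. The names col_names and keep are illustrative; col_names is taken from the question, and keep is introduced here only for the example.

```r
my_df <- data.frame(
  word1 = c("one", "two", "red", "blue"),
  word2 = c("apple", "orange", "banana", "pear"),
  word3 = c("red", "orange", "yellow", "green")
)
color_words <- c("red", "orange", "yellow", "green", "blue")
col_names <- c("word1", "word2")

# One logical vector per selected column: TRUE where the value is a color word
hits <- lapply(my_df[col_names], function(x) x %in% color_words)

# Combine the per-column vectors with OR, then negate: keep rows with no hit at all
keep <- !Reduce(`|`, hits)

result <- my_df[keep, ]
result  # only the row (one, apple, red) remains
```

Because %in% compares whole values, a value that merely contains a color word is not dropped here.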

Upvotes: 0

Ronak Shah

Reputation: 388817

You don't need a loop; you can use filter() with across() to apply a function to multiple columns:

library(dplyr)
my_df %>% filter(across(all_of(col_names), ~!. %in% color_words))

#  word1 word2 word3
#1   one apple   red
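In more recent dplyr releases (1.0.4 and later, as far as I know), using across() inside filter() is deprecated in favour of if_all() and if_any(). A sketch of the same filter with if_all(), assuming such a version is installed and that col_names and color_words are defined as in the question:

```r
library(dplyr)

my_df <- data.frame(
  word1 = c("one", "two", "red", "blue"),
  word2 = c("apple", "orange", "banana", "pear"),
  word3 = c("red", "orange", "yellow", "green")
)
color_words <- c("red", "orange", "yellow", "green", "blue")
col_names <- c("word1", "word2")

# if_all() keeps a row only when the condition is TRUE for every selected column
result <- my_df %>%
  filter(if_all(all_of(col_names), ~ !.x %in% color_words))

result  # only the row (one, apple, red) remains
```

if_any() is the counterpart that keeps a row when the condition holds for at least one of the selected columns.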

If you have an older version of dplyr, use filter_at():

my_df %>% filter_at(col_names, all_vars(!. %in% color_words))
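If you do want to keep the loop from the question, the reason the first attempt removes nothing is that col holds a string such as "word1", so !col %in% color_words tests that string itself rather than the column, and it is TRUE for every row. One way to look a column up by its name inside filter() is the .data pronoun. A minimal sketch, where result is a name introduced here so the original my_df is left untouched:

```r
library(dplyr)

my_df <- data.frame(
  word1 = c("one", "two", "red", "blue"),
  word2 = c("apple", "orange", "banana", "pear"),
  word3 = c("red", "orange", "yellow", "green")
)
color_words <- c("red", "orange", "yellow", "green", "blue")
col_names <- c("word1", "word2")

result <- my_df
for (col in col_names) {
  # .data[[col]] refers to the column whose name is stored in the string `col`
  result <- result %>% filter(!.data[[col]] %in% color_words)
}

result  # only the row (one, apple, red) remains
```

Each pass filters the already-filtered result, which is equivalent to chaining one filter() call per column as in the question's working example.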

Upvotes: 2
