Reputation: 283
I want to use a loop to filter multiple columns of a data frame, removing rows where any of the given column values are in a particular list.
For instance:
> my_df <- data.frame(word1 = c("one", "two", "red", "blue"), word2 = c("apple","orange","banana","pear"), word3 = c("red", "orange", "yellow", "green"))
> color_words = c("red", "orange", "yellow", "green", "blue")
> my_df
word1 word2 word3
1 one apple red
2 two orange orange
3 red banana yellow
4 blue pear green
Using the dplyr filter() function:
> my_df %>% filter(!word1 %in% color_words) %>% filter(!word2 %in% color_words)
word1 word2 word3
1 one apple red
My first attempt to perform this filtering in a loop was:
col_names <- c("word1","word2")
for(col in col_names){
my_df <- my_df %>% filter(!col %in% color_words)
}
> my_df
word1 word2 word3
1 one apple red
2 two orange orange
3 red banana yellow
4 blue pear green
I read about quoting and unquoting when using filter(), so I also tried:
for(col in col_names){
col <- enquo(col)
my_df <- my_df %>% filter(!UQ(col) %in% color_words)
}
> my_df
word1 word2 word3
1 one apple red
2 two orange orange
3 red banana yellow
4 blue pear green
and
for(col in col_names){
my_df <- my_df %>% filter(!UQ(col) %in% color_words)
}
> my_df
word1 word2 word3
1 one apple red
2 two orange orange
3 red banana yellow
4 blue pear green
What is the correct way to go about doing this filtering via a loop?
Upvotes: 0
Views: 1589
Reputation: 8880
You can do this in base R:
my_df <- data.frame(word1 = c("one", "two", "red", "blue"), word2 = c("apple","orange","banana","pear"), word3 = c("red", "orange", "yellow", "green"))
color_words <- paste0(c("red", "orange", "yellow", "green", "blue"), collapse = "|")
fltr <- apply(my_df[1:2], 1, function(x) !any(grepl(color_words, x)))
my_df[fltr, ]
#> word1 word2 word3
#> 1 one apple red
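One caveat with the grepl() approach above: the pattern "red|orange|..." matches substrings, so a word such as "bored" would also be dropped. If exact matches are wanted, as with %in% in the question, a minimal base R sketch could look like the following. The names col_names, hits, keep and result_base are only illustrative.

```r
my_df <- data.frame(
  word1 = c("one", "two", "red", "blue"),
  word2 = c("apple", "orange", "banana", "pear"),
  word3 = c("red", "orange", "yellow", "green")
)
color_words <- c("red", "orange", "yellow", "green", "blue")
col_names <- c("word1", "word2")  # columns to check (assumed from the question)

# For each selected column, flag rows whose value is exactly a colour word
hits <- lapply(my_df[col_names], function(x) x %in% color_words)

# A row is kept only if none of the selected columns was flagged
keep <- !Reduce(`|`, hits)

result_base <- my_df[keep, ]
result_base
```

Using Reduce() over a list of logical vectors avoids relying on sapply() simplifying to a matrix, which can behave differently for single-row data frames.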
Created on 2020-09-25 by the reprex package (v0.3.0)
Upvotes: 0
Reputation: 388817
You don't need a loop; you can use filter with across to apply a function to multiple columns:
library(dplyr)
my_df %>% filter(across(all_of(col_names), ~!. %in% color_words))
# word1 word2 word3
#1 one apple red
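In dplyr 1.0.4 and later, using across() inside filter() is deprecated in favour of if_all() and if_any(). A sketch of the same filter with if_all(), assuming such a version is installed and reusing the data from the question (result_if_all is just an illustrative name):

```r
library(dplyr)

my_df <- data.frame(
  word1 = c("one", "two", "red", "blue"),
  word2 = c("apple", "orange", "banana", "pear"),
  word3 = c("red", "orange", "yellow", "green")
)
color_words <- c("red", "orange", "yellow", "green", "blue")
col_names <- c("word1", "word2")

# Keep rows where every selected column is NOT a colour word
result_if_all <- my_df %>%
  filter(if_all(all_of(col_names), ~ !.x %in% color_words))

result_if_all
```

if_all() keeps a row only when the condition holds for all selected columns, which matches the chained filter() calls in the question.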
If you have an older version of dplyr, use filter_at:
my_df %>% filter_at(col_names, all_vars(!. %in% color_words))
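If you do want to keep the loop, the likely reason the attempts in the question return every row is that col holds a character string such as "word1", not the column itself. !col %in% color_words then compares that string with the colour words and yields a single TRUE, which is recycled across all rows. Unquoting the string substitutes the same string, so the result does not change. One way around this, sketched below, is the .data pronoun, which looks a column up by name (result_loop is an illustrative name):

```r
library(dplyr)

my_df <- data.frame(
  word1 = c("one", "two", "red", "blue"),
  word2 = c("apple", "orange", "banana", "pear"),
  word3 = c("red", "orange", "yellow", "green")
)
color_words <- c("red", "orange", "yellow", "green", "blue")
col_names <- c("word1", "word2")

result_loop <- my_df
for (col in col_names) {
  # .data[[col]] retrieves the column whose name is stored in col
  result_loop <- result_loop %>% filter(!.data[[col]] %in% color_words)
}

result_loop
```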
Upvotes: 2