Reputation: 136
I have this data:
# A tibble: 20 x 6
ID style param1 param2 param3 param4
<dbl> <chr> <chr> <chr> <chr> <chr>
1 1 ar R78 NA NA NA
2 2 bg NA NA NA NA
3 3 bh NA NA NA NA
4 4 ar NA R78 NA NA
5 5 bg NA NA NA NA
6 6 bh NA NA NA NA
7 7 ar R78 NA NA NA
8 8 bg NA NA R78 NA
9 9 bh NA NA NA NA
10 10 ar NA R78 NA NA
11 11 bg NA NA NA NA
12 12 bh NA NA R78 NA
13 13 ar NA NA NA NA
14 14 bg R78 NA NA NA
15 15 bh NA NA NA NA
16 16 ar NA NA NA NA
17 17 bg NA NA NA NA
18 18 bh R78 NA NA NA
19 19 ar NA NA NA R78
20 20 bg NA NA NA NA
I want to use dplyr::filter to select the rows where R78 appears in any of the columns param1, param2, param3 or param4.
I tried:
data %>%
filter(across(param1:param4) == "R78")
which returns:
# A tibble: 4 x 6
ID style param1 param2 param3 param4
<dbl> <chr> <chr> <chr> <chr> <chr>
1 1 ar R78 NA NA NA
2 7 ar R78 NA NA NA
3 14 bg R78 NA NA NA
4 18 bh R78 NA NA NA
This is the same result as when I run data %>% filter(param1 == "R78")
...
Maybe I am misusing the across function. I have also tried chaining multiple "|" conditions, but that never worked.
I expect my code to return a tibble with only rows 1, 4, 7, 8, 10, 12, 14, 18 and 19.
Thanks!
Upvotes: 2
Views: 67
Reputation: 9247
Just my two cents: here is another possible solution, which follows what dplyr's documentation says about the filter function. It says:
Previously, filter() was paired with the all_vars() and any_vars() helpers. Now, across() is equivalent to all_vars(), and there’s no direct replacement for any_vars(). However you can make a simple helper yourself.
Here I create the helper function rowAny, which returns a logical vector indicating, for each row, whether the condition x == "R78" holds in at least one column. I then apply it to the columns selected by across.
rowAny <- function(x) {rowSums(x == "R78", na.rm = TRUE) > 0}
df %>% filter(rowAny(across(param1:param4)))
# A tibble: 9 x 6
# ID style param1 param2 param3 param4
# <int> <chr> <chr> <chr> <chr> <chr>
# 1 1 ar R78 NA NA NA
# 2 4 ar NA R78 NA NA
# 3 7 ar R78 NA NA NA
# 4 8 bg NA NA R78 NA
# 5 10 ar NA R78 NA NA
# 6 12 bh NA NA R78 NA
# 7 14 bg R78 NA NA NA
# 8 18 bh R78 NA NA NA
# 9 19 ar NA NA NA R78
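To make the three steps inside rowAny easier to follow, here is a minimal base R sketch on a made-up three-row data frame. The name mini and its contents are assumptions for illustration only, not the data from the question.

```r
# Hypothetical miniature of the question's data: three rows, two param columns.
mini <- data.frame(
  ID = 1:3,
  param1 = c("R78", NA, NA),
  param2 = c(NA, NA, "R78"),
  stringsAsFactors = FALSE
)

# Step 1: element-wise comparison gives a logical matrix; NA cells stay NA.
hits <- mini[c("param1", "param2")] == "R78"

# Step 2: count the TRUE cells per row, ignoring the NA cells.
n_hits <- rowSums(hits, na.rm = TRUE)

# Step 3: keep the rows with at least one match.
keep <- n_hits > 0
mini[keep, ]
```

Without na.rm = TRUE, any row containing an NA would get an NA count, so the comparison with 0 would be NA as well.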
Upvotes: 2
Reputation: 3923
There is, of course, the old-fashioned, uncomplicated but wordy solution of just writing out all the "or"s.
mydf %>% filter(param1 == "R78" | param2 == "R78" | param3 == "R78" | param4 == "R78")
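It may not be obvious why chaining the comparisons with | copes with all the NA cells. The base R sketch below shows the relevant logic on single values; the variable names are made up for illustration.

```r
# One row where param1 is NA and param2 is "R78".
p1 <- NA_character_
p2 <- "R78"

cmp1 <- p1 == "R78"    # NA: a comparison with NA is unknown
cmp2 <- p2 == "R78"    # TRUE

either <- cmp1 | cmp2  # TRUE: one TRUE operand is enough for |

# One row where both values are NA: the result is NA, not FALSE.
neither <- (NA_character_ == "R78") | (NA_character_ == "R78")
```

dplyr::filter() drops rows whose condition evaluates to NA, so rows without any R78 are excluded even though their condition is NA rather than FALSE.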
Upvotes: 1
Reputation: 388817
across works column-wise. In such cases I think it is better to use filter_at:
library(dplyr)
df %>% filter_at(vars(param1:param4), any_vars(. == 'R78'))
# ID style param1 param2 param3 param4
#1 1 ar R78 <NA> <NA> <NA>
#4 4 ar <NA> R78 <NA> <NA>
#7 7 ar R78 <NA> <NA> <NA>
#8 8 bg <NA> <NA> R78 <NA>
#10 10 ar <NA> R78 <NA> <NA>
#12 12 bh <NA> <NA> R78 <NA>
#14 14 bg R78 <NA> <NA> <NA>
#18 18 bh R78 <NA> <NA> <NA>
#19 19 ar <NA> <NA> <NA> R78
A hack to make across work is to use Reduce:
df %>% filter(Reduce(`|`, across(param1:param4, ~. == 'R78')))
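If your dplyr version is 1.0.4 or later, if_any() expresses the same row-wise "or" without the Reduce workaround. The sketch below assumes such a version is installed and uses a made-up tibble; mini and the value "X1" are hypothetical, not taken from the question.

```r
library(dplyr)

# Hypothetical miniature of the question's data.
mini <- tibble(
  ID = 1:4,
  param1 = c("R78", NA, NA, NA),
  param2 = c(NA, NA, "R78", "X1")
)

# Keep rows where at least one of the selected columns equals "R78".
# Rows whose combined condition is NA or FALSE are dropped by filter().
res <- mini %>% filter(if_any(param1:param2, ~ .x == "R78"))
res
```

On the question's data the equivalent call would be df %>% filter(if_any(param1:param4, ~ .x == "R78")).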
In base R, you can use rowSums:
cols <- paste0('param', 1:4)
df[rowSums(df[cols] == 'R78', na.rm = TRUE) > 0, ]
Upvotes: 4