Reputation: 2561
I would like to filter rows where at least one column, excluding P, is greater than P, using dplyr. I am trying to figure out a solution that works across all columns.
Example
library(dplyr)
df <- tibble(P = c(2, 4, 5, 6, 1.4),
             B = c(2.1, 3, 5.5, 1.2, 2),
             C = c(2.2, 3.8, 5.7, 5, 1.5))
Desired output
df <- filter(df, B > P | C > P)
df
One solution using apply, which I would like to avoid if possible:
filter(df, apply(df, 1, function(x) any(x[-1] > x[1])))
Upvotes: 0
Views: 451
Reputation: 887951
Here is an option using tidyverse, where we make use of the map and reduce functions from purrr to get a logical vector, and then extract (from magrittr) the rows of the original dataset.
library(tidyverse)
library(magrittr)
df %>%
select(-one_of("P")) %>%
map(~ .> df$P) %>%
reduce(`|`) %>%
extract(df, .,)
# A tibble: 3 × 3
# P B C
# <dbl> <dbl> <dbl>
#1 2.0 2.1 2.2
#2 5.0 5.5 5.7
#3 1.4 2.0 1.5
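To make the pipeline above easier to follow, here is a minimal sketch that runs the same three steps one at a time. It assumes the example data from the question, stored as a plain data frame, and the names cmp, keep and result are only illustrative.

```r
library(purrr)

# Same data as in the question, as a plain data frame
df <- data.frame(P = c(2, 4, 5, 6, 1.4),
                 B = c(2.1, 3, 5.5, 1.2, 2),
                 C = c(2.2, 3.8, 5.7, 5, 1.5))

# Step 1: one logical vector per non-P column (is the column greater than P?)
cmp <- map(df[setdiff(names(df), "P")], ~ .x > df$P)
# cmp$B: TRUE FALSE TRUE FALSE TRUE
# cmp$C: TRUE FALSE TRUE FALSE TRUE

# Step 2: combine the vectors element-wise with OR
keep <- reduce(cmp, `|`)

# Step 3: keep only the rows where at least one comparison was TRUE
result <- df[keep, ]
```

Because reduce applies `|` pairwise, the same code works unchanged for any number of columns.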
This can also be converted to a function using the devel version of dplyr (soon to be released as 0.6.0), which introduced quosures and unquoting for evaluation. enquo is similar to substitute from base R: it takes the user input and converts it to a quosure. one_of takes string arguments, so the quosure is converted to a string with quo_name.
funFilter <- function(dat, colToCompare){
colToCompare <- quo_name(enquo(colToCompare))
dat %>%
select(-one_of(colToCompare)) %>%
map(~ .> dat[[colToCompare]]) %>%
reduce(`|`) %>%
extract(dat, ., )
}
funFilter(df, P) # compare all other columns with P
# A tibble: 3 × 3
# P B C
# <dbl> <dbl> <dbl>
#1 2.0 2.1 2.2
#2 5.0 5.5 5.7
#3 1.4 2.0 1.5
funFilter(df, B) #compare all other columns with B
# A tibble: 4 × 3
# P B C
# <dbl> <dbl> <dbl>
#1 2 2.1 2.2
#2 4 3.0 3.8
#3 5 5.5 5.7
#4 6 1.2 5.0
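As a small illustration of what quo_name(enquo(...)) does inside funFilter, the sketch below captures a bare column name and returns it as a string. The helper col_as_string is hypothetical and exists only for this example. Newer rlang versions recommend as_name() or as_label() for this job, but quo_name() is kept here to match the answer.

```r
library(rlang)

# Hypothetical helper: capture an unquoted argument and return its name as a string
col_as_string <- function(col) {
  quo_name(enquo(col))
}

col_as_string(P)
# "P"
```

This is why funFilter(df, P) can be called with a bare P and still pass a string to one_of and to dat[[...]].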
We can also build the filter condition as a string and parse it into an expression
v1 <- setdiff(names(df), "P")
filter(df, !!rlang::parse_quosure(paste(v1, "P", sep=" > ", collapse=" | ")))
# A tibble: 3 × 3
# P B C
# <dbl> <dbl> <dbl>
#1 2.0 2.1 2.2
#2 5.0 5.5 5.7
#3 1.4 2.0 1.5
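It can help to look at the string that paste builds before it is parsed. The sketch below does that, and then uses rlang::parse_expr() instead of parse_quosure(), on the assumption that a recent rlang is installed. The variable names cond and result are illustrative.

```r
library(dplyr)
library(rlang)

df <- tibble(P = c(2, 4, 5, 6, 1.4),
             B = c(2.1, 3, 5.5, 1.2, 2),
             C = c(2.2, 3.8, 5.7, 5, 1.5))

# Every column name except P
v1 <- setdiff(names(df), "P")

# One "<col> > P" term per column, joined with " | "
cond <- paste(v1, "P", sep = " > ", collapse = " | ")
cond
# "B > P | C > P"

# parse_expr() turns the string into an unevaluated expression,
# and !! splices it into filter()
result <- filter(df, !!parse_expr(cond))
```

Building code from strings is convenient here, but it relies on the column names being syntactically valid R names.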
This can also be made into a function
funFilter2 <- function(dat, colToCompare){
colToCompare <- quo_name(enquo(colToCompare))
v1 <- setdiff(names(dat), colToCompare)
expr <- rlang::parse_quosure(paste(v1, colToCompare, sep= " > ", collapse= " | "))
dat %>%
filter(!!expr)
}
funFilter2(df, P)
# A tibble: 3 × 3
# P B C
# <dbl> <dbl> <dbl>
#1 2.0 2.1 2.2
#2 5.0 5.5 5.7
#3 1.4 2.0 1.5
funFilter2(df, B)
# A tibble: 4 × 3
# P B C
# <dbl> <dbl> <dbl>
#1 2 2.1 2.2
#2 4 3.0 3.8
#3 5 5.5 5.7
#4 6 1.2 5.0
Another approach is to take the row-wise maximum with pmax and compare it with P
df %>%
filter(do.call(pmax, .) > P)
# A tibble: 3 × 3
# P B C
# <dbl> <dbl> <dbl>
#1 2.0 2.1 2.2
#2 5.0 5.5 5.7
#3 1.4 2.0 1.5
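The reasoning behind the pmax version is that the row-wise maximum over all columns is greater than P exactly when some other column is greater than P. A minimal base R sketch of the intermediate values, assuming the example data from the question (row_max, keep and result are illustrative names):

```r
df <- data.frame(P = c(2, 4, 5, 6, 1.4),
                 B = c(2.1, 3, 5.5, 1.2, 2),
                 C = c(2.2, 3.8, 5.7, 5, 1.5))

# pmax() takes the element-wise maximum across its arguments;
# do.call() passes every column of df as a separate argument
row_max <- do.call(pmax, df)
row_max
# 2.2 4.0 5.7 6.0 2.0

# Rows where the maximum exceeds P have at least one column larger than P
keep <- row_max > df$P
result <- df[keep, ]
```

Since pmax is vectorised, this avoids the row-by-row function calls that apply makes.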
Upvotes: 2
Reputation: 18435
Without dplyr...
df2 <- df[df$P!=apply(df,1,max),]
or with dplyr...
df3 <- df %>% filter(P!=apply(df,1,max))
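The test P != max works because, when there are no missing values, P differs from the row maximum exactly when some other column exceeds it. The sketch below checks this equivalence on the example data. The names others, any_greater, p_not_max and same are only illustrative.

```r
df <- data.frame(P = c(2, 4, 5, 6, 1.4),
                 B = c(2.1, 3, 5.5, 1.2, 2),
                 C = c(2.2, 3.8, 5.7, 5, 1.5))

# Direct test: is any non-P column greater than P?
others <- setdiff(names(df), "P")
any_greater <- Reduce(`|`, lapply(df[others], function(col) col > df$P))

# Indirect test used above: P is not the row maximum
p_not_max <- df$P != apply(df, 1, max)

# Both tests select the same rows on this data
same <- all(any_greater == p_not_max)
```

With NA values in the data the two tests may behave differently, so this should be read as a check under the no-missing-values assumption.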
Upvotes: 2