Reputation: 137
I would like to take a list (or filter) a set of similarly named columns in a dataframe, evaluate the value for those columns for each row, and then create a new column with the result.
Right now I am using case_when
but I am not aware of a way to provide some wildcard or list of defined columns.
I would like to take a list, or create a filter, of columns because I want to evaluate several columns in the dataframe, not just a handful. Having a long list of columns in case_when
seems not efficient when the column names are quite similar to each other.
# Dummy data
ignore1 <- c(1, 0, 0)
ignore2 <- c(1, 0, 1)
col1 <- c(0, 1, 0)
col2 <- c(0, 1, 1)
col3 <- c(0, 1, 0)
df <- data.frame(ignore1, ignore2, col1, col2, col3)
df %>%
mutate(evaluation = case_when(
col1 == 0| col1 == 0 | col1 == 0 ~ "Failed",
TRUE ~ "Passed"
)
)
This is the expected result:
ignore1 ignore2 col1 col2 col3 evaluation
1 1 1 0 0 0 Failed
2 0 0 1 1 1 Passed
3 0 1 0 1 0 Failed
where row 2 passes because col1
, col2
, col3
all have the value of 1
.
Upvotes: 1
Views: 141
Reputation: 887711
We can just use rowSums
to make this efficient
i1 <- startsWith(names(df), 'col')
c( "Failed", "Passed")[(rowSums(df[i1] == 1) == 3) + 1]
#[1] "Failed" "Passed" "Failed"
Or another base R
efficient option is Reduce
c("Failed", "Passed")[Reduce(`&`, df[i1]) +1]
#[1] "Failed" "Passed" "Failed"
NOTE: Both base R
solutions are compact and are very efficient
Or with &
library(dplyr)
df %>%
mutate(evaluation = c('Failed', 'Passed')[1 + (col1 & col2 & col3)])
# ignore1 ignore2 col1 col2 col3 evaluation
#1 1 1 0 0 0 Failed
#2 0 0 1 1 1 Passed
#3 0 1 0 1 0 Failed
Or we can have rowSums
within dplyr
df %>%
mutate(evaluation = c("Failed", "Passed")[(rowSums(.[i1] == 1) == 3) + 1])
NOTE: Both the solutions are very efficient and doesn't use any packages that are not really needed
Or if we need some packages, then use magrittr
with purrr
library(magrittr)
library(purrr)
df %>%
mutate(evaluation = select(., starts_with('col')) %>%
reduce(`&`) %>%
add(1) %>%
c("Failed", "Passed")[.])
# ignore1 ignore2 col1 col2 col3 evaluation
#1 1 1 0 0 0 Failed
#2 0 0 1 1 1 Passed
#3 0 1 0 1 0 Failed
NOTE: Here also, we are not looping over rows, so it should be efficient
Upvotes: 1
Reputation: 389215
As you want to perform a row-wise operation we can use pmap
variants
library(dplyr)
library(purrr)
df %>% mutate(result =c("Passed","Failed")[pmap_lgl(select(., starts_with('col')),
~any(c(...) == 0)) + 1])
# ignore1 ignore2 col1 col2 col3 result
#1 1 1 0 0 0 Failed
#2 0 0 1 1 1 Passed
#3 0 1 0 1 0 Failed
In base R, we can use apply
row-wise :
cols <- startsWith(names(df), 'col')
df$Result <- c("Passed", "Failed")[apply(df[cols] == 0, 1, any) + 1]
Upvotes: 0