Reputation: 193
Given this data:
df=data.frame(
x1=c(2,0,0,NA,0,1,1,NA,0,1),
x2=c(3,2,NA,5,3,2,NA,NA,4,5),
x3=c(0,1,0,1,3,0,NA,NA,0,1),
x4=c(1,0,NA,3,0,0,NA,0,0,1),
x5=c(1,1,NA,1,3,4,NA,3,3,1))
I want to create an extra column min
for the rowwise minimal value of selected columns using dplyr. That's easy using the column names:
df <- df %>% rowwise() %>% mutate(min = min(x2,x5))
But I have a large df with varying column names so I need to match them from some string of values mycols
. Now other threads tell me to use select helper functions, but I must be missing something. Here's matches
:
mycols <- c("x2","x5")
df <- df %>% rowwise() %>%
mutate(min = min(select(matches(mycols))))
Error: is.string(match) is not TRUE
And one_of
:
mycols <- c("x2","x5")
df <- df %>%
rowwise() %>%
mutate(min = min(select(one_of(mycols))))
Error: no applicable method for 'select' applied to an object of class "c('integer', 'numeric')"
In addition: Warning message:
In one_of(c("x2", "x5")) : Unknown variables: `x2`, `x5`
What am I overlooking? Should select_
work? It doesn't in the following:
df <- df %>%
rowwise() %>%
mutate(min = min(select_(mycols)))
Error: no applicable method for 'select_' applied to an object of class "character"
And likewise:
df <- df %>%
rowwise() %>%
mutate(min = min(select_(matches(mycols))))
Error: is.string(match) is not TRUE
Upvotes: 1
Views: 4708
Reputation: 1350
This was a bit trickier. In case of SE evaluation you'd need to pass the operation as string.
mycols <- '(x2,x5)'
f <- paste0('min',mycols)
df %>% rowwise() %>% mutate_(min = f)
df
# A tibble: 10 × 6
# x1 x2 x3 x4 x5 min
# <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#1 2 3 0 1 1 1
#2 0 2 1 0 1 1
#3 0 NA 0 NA NA NA
#4 NA 5 1 3 1 1
#5 0 3 3 0 3 3
#6 1 2 0 0 4 2
#7 1 NA NA NA NA NA
#8 NA NA NA 0 3 NA
#9 0 4 0 0 3 3
#10 1 5 1 1 1 1
Upvotes: 2
Reputation: 6542
Here's another solution a bit technical with the help of purrr
package from the tidyverse designed for functional programming.
Fist, matches
helpers from dplyr
takes a regex string as argument not a vector. It is a good way for you to find a regex that matches all your columns.
(in the code under you can use the dplyr
select helper that you wish)
Then, purrr
functions works great with dplyr
when you understand the underlying scheme of functionnal programming.
Solution to your problem :
df=data.frame(
x1=c(2,0,0,NA,0,1,1,NA,0,1),
x2=c(3,2,NA,5,3,2,NA,NA,4,5),
x3=c(0,1,0,1,3,0,NA,NA,0,1),
x4=c(1,0,NA,3,0,0,NA,0,0,1),
x5=c(1,1,NA,1,3,4,NA,3,3,1))
# regex to get only x2 and x5 column
mycols <- "x[25]"
library(dplyr)
df %>%
mutate(min_x2_x5 =
# select columns that you want in df
select(., matches(mycols)) %>%
# use pmap on this subset to get a vector of min from each row.
# dataframe is a list so pmap works on each element of the list that is to say each row
purrr::pmap_dbl(min)
)
#> x1 x2 x3 x4 x5 min_x2_x5
#> 1 2 3 0 1 1 1
#> 2 0 2 1 0 1 1
#> 3 0 NA 0 NA NA NA
#> 4 NA 5 1 3 1 1
#> 5 0 3 3 0 3 3
#> 6 1 2 0 0 4 2
#> 7 1 NA NA NA NA NA
#> 8 NA NA NA 0 3 NA
#> 9 0 4 0 0 3 3
#> 10 1 5 1 1 1 1
I won't explain further about purrr
here but it works fine in your case
Upvotes: 6