Reputation: 21
I would like to know how to use the dplyr mutate function when I don't know the column names. Here is my example code:
library(dplyr)
w<-c(2,3,4)
x<-c(1,2,7)
y<-c(1,5,4)
z<-c(3,2,6)
df <- data.frame(w,x,y,z)
df %>% rowwise() %>% mutate(minimum = min(x,y,z))
Source: local data frame [3 x 5]
Groups: <by row>
# A tibble: 3 x 5
w x y z minimum
<dbl> <dbl> <dbl> <dbl> <dbl>
1 2 1 1 3 1
2 3 2 5 2 2
3 4 7 4 6 4
This code finds the minimum value in each row. "df %>% rowwise() %>% mutate(minimum = min(x,y,z))" works because I typed the column names x, y, and z. But suppose I have a really big data.frame with several hundred columns and I don't know all of the column names. Or I have multiple data.frames that all have different column names, and I just want to find the minimum value from the 10th to the 20th column in each row of each data.frame.
In the example data.frame above, let's assume that I don't know the column names and just want the minimum value from the 2nd to the 4th column in each row. Of course, this doesn't work: every row gets 1, because df[,2] is the whole column vector rather than the current row's value, so min is taken over all values of the three columns:
df %>% rowwise() %>% mutate(minimum=min(df[,2],df[,3], df[,4]))
Source: local data frame [3 x 5]
Groups: <by row>
# A tibble: 3 x 5
w x y z minimum
<dbl> <dbl> <dbl> <dbl> <dbl>
1 2 1 1 3 1
2 3 2 5 2 1
3 4 7 4 6 1
The two attempts below also don't work:
df %>% rowwise() %>% mutate(average=min(colnames(df)[2], colnames(df)[3], colnames(df)[4]))
df %>% rowwise() %>% mutate(average=min(noquote(colnames(df)[2]), noquote(colnames(df)[3]), noquote(colnames(df)[4])))
I know that I can get the minimum value with apply or another method when I don't know the column names. But I would like to know whether the dplyr mutate function can do this without known column names.
Thank you,
Upvotes: 1
Views: 1433
Reputation: 18661
With apply:
library(dplyr)
library(purrr)
df %>%
mutate(minimum = apply(df[,2:4], 1, min))
or with pmap_dbl (plain pmap would return a list column rather than a numeric one):
df %>%
mutate(minimum = pmap_dbl(.[2:4], min))
Also with by_row from purrrlyr:
df %>%
purrrlyr::by_row(~min(.[2:4]), .collate = "rows", .to = "minimum")
Output:
# tibble [3 x 5]
w x y z minimum
<dbl> <dbl> <dbl> <dbl> <dbl>
1 2 1 1 3 1
2 3 2 5 2 2
3 4 7 4 6 4
Upvotes: 2
Reputation: 886938
A vectorized option would be pmin. Convert the column names to symbols with syms and evaluate (!!!) to return the values of the columns on which pmin is applied:
library(dplyr)
df %>%
mutate(minimum = pmin(!!! rlang::syms(names(.)[2:4])))
# w x y z minimum
#1 2 1 1 3 1
#2 3 2 5 2 2
#3 4 7 4 6 4
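The same vectorized idea can probably also be written without tidy evaluation. The following is a minimal sketch in base R, assuming the sample data from the question; the helper name row_min_by_position is hypothetical and only for illustration. do.call() passes each selected column to pmin as a separate argument.

```r
# Sample data from the question
df <- data.frame(w = c(2, 3, 4), x = c(1, 2, 7), y = c(1, 5, 4), z = c(3, 2, 6))

# Hypothetical helper: row-wise minimum over columns selected by position.
row_min_by_position <- function(data, positions) {
  # data[positions] stays a list of columns even when one position is given;
  # unname() keeps a column name such as "na.rm" from being matched as an argument
  do.call(pmin, unname(as.list(data[positions])))
}

df$minimum <- row_min_by_position(df, 2:4)
print(df)
```

Because the columns are picked by position, the same call should work on data frames whose column names differ.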
Upvotes: 1
Reputation: 15062
Here is a tidyeval approach along the lines of the suggestion from aosmith. If you don't know the column names, you can make a function that accepts the desired positions as inputs and finds the column names itself. Here, rlang::syms() takes the column names as strings and turns them into symbols, and !!! unquotes and splices the symbols into the function.
library(dplyr)
w<-c(2,3,4)
x<-c(1,2,7)
y<-c(1,5,4)
z<-c(3,2,6)
df <- data.frame(w,x,y,z)
rowwise_min <- function(df, min_cols){
  # Look up the names by position; this also works when min_cols selects a single column
  cols <- colnames(df)[min_cols] %>% rlang::syms()
  df %>%
    rowwise %>%
    mutate(minimum = min(!!!cols))
}
rowwise_min(df, 2:4)
#> Source: local data frame [3 x 5]
#> Groups: <by row>
#>
#> # A tibble: 3 x 5
#> w x y z minimum
#> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 2 1 1 3 1
#> 2 3 2 5 2 2
#> 3 4 7 4 6 4
rowwise_min(df, c(1, 3))
#> Source: local data frame [3 x 5]
#> Groups: <by row>
#>
#> # A tibble: 3 x 5
#> w x y z minimum
#> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 2 1 1 3 1
#> 2 3 2 5 2 3
#> 3 4 7 4 6 4
Created on 2018-09-04 by the reprex package (v0.2.0).
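In more recent dplyr releases, the position-based selection can likely be done without building symbols by hand. This is a hedged sketch, assuming dplyr 1.0.0 or later (where c_across() is available); the function name rowwise_min_across is hypothetical and not part of the original answer.

```r
library(dplyr)

# Sample data from the question
df <- data.frame(w = c(2, 3, 4), x = c(1, 2, 7), y = c(1, 5, 4), z = c(3, 2, 6))

# Hypothetical variant of rowwise_min(): c_across() gathers the selected
# columns of the current row into one vector, which min() then reduces.
rowwise_min_across <- function(data, min_cols) {
  data %>%
    rowwise() %>%
    mutate(minimum = min(c_across(all_of(min_cols)))) %>%
    ungroup()  # drop the row-wise grouping so later verbs behave normally
}

result <- rowwise_min_across(df, 2:4)
print(result)
```

Like the rowwise version above, this evaluates min() once per row, so the pmin approach is probably still the faster choice on large data frames.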
Upvotes: 0