Reputation: 1254
I have defined a function which will dynamically create new variables in a data frame. For this function the input is a string which is then pasted with other strings to create variable names that already exist in the data frame which are then compared using case_when within mutate. The output of the function is the data frame with the new variable appended to the end. I want to apply this function to a vector of inputs, and create multiple new columns in the data frame. I have used the iris data set to create a function very similar to what I am doing.
func <- function(x) {
a <- paste0("Sepal.", x)
b <- paste0("Petal.", x)
x <- iris %>%
mutate(
!!(paste0("Compare.", x)) :=
case_when(
iris[[a]] > iris[[b]] ~ "Sepal",
iris[[a]] < iris[[b]] ~ "Petal",
TRUE ~ "Equal"
)
)
return(x)
}
inputVector <- c("Length", "Width")
I want to apply this function to inputVector and have one data frame returned which would have the two new variables in it, preferably without loops. I'm looking to use something like
iris <- lapply(inputVector, func)
But this returns a list of two data frames. I also know that I could use <<-
in func
, but I would like to avoid this.
Upvotes: 0
Views: 1208
Reputation: 43344
A simpler option is to only generate the new columns in the function by substituting transmute
for mutate
, iterating and joining them into a data frame with map_dfc
, and then using bind_cols
to add them to the original:
library(tidyverse)
func <- function(x) {
a <- sym(paste0("Sepal.", x)) # these need to be quosures to refer to variables
b <- sym(paste0("Petal.", x))
iris %>% transmute(
!!paste0("Compare.", x) := case_when(
!!a > !!b ~ "Sepal", # unquote quosures
!!a < !!b ~ "Petal",
TRUE ~ "Equal"
)
)
}
inputVector <- c("Length", "Width")
iris %>% bind_cols(map_dfc(inputVector, func)) %>% head()
#> Sepal.Length Sepal.Width Petal.Length Petal.Width Species Compare.Length
#> 1 5.1 3.5 1.4 0.2 setosa Sepal
#> 2 4.9 3.0 1.4 0.2 setosa Sepal
#> 3 4.7 3.2 1.3 0.2 setosa Sepal
#> 4 4.6 3.1 1.5 0.2 setosa Sepal
#> 5 5.0 3.6 1.4 0.2 setosa Sepal
#> 6 5.4 3.9 1.7 0.4 setosa Sepal
#> Compare.Width
#> 1 Sepal
#> 2 Sepal
#> 3 Sepal
#> 4 Sepal
#> 5 Sepal
#> 6 Sepal
A slightly more elegant approach is to structure the function to take a vector, moving all the iteration inside. You could use the approach above, or just generate the vectors and assemble them into a data frame once there are multiple:
func2 <- function(x) {
columns <- map_dfc(x, function(y){
a <- paste0("Sepal.", y)
b <- paste0("Petal.", y)
column <- list(case_when(
iris[[a]] > iris[[b]] ~ "Sepal", # base notation is simpler than quosures
iris[[a]] < iris[[b]] ~ "Petal",
TRUE ~ "Equal"
))
names(column) <- paste0("Compare.", y)
column
})
iris %>% bind_cols(columns)
}
func2(inputVector) %>% tail()
#> Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> 145 6.7 3.3 5.7 2.5 virginica
#> 146 6.7 3.0 5.2 2.3 virginica
#> 147 6.3 2.5 5.0 1.9 virginica
#> 148 6.5 3.0 5.2 2.0 virginica
#> 149 6.2 3.4 5.4 2.3 virginica
#> 150 5.9 3.0 5.1 1.8 virginica
#> Compare.Length Compare.Width
#> 145 Sepal Sepal
#> 146 Sepal Sepal
#> 147 Sepal Sepal
#> 148 Sepal Sepal
#> 149 Sepal Sepal
#> 150 Sepal Sepal
Upvotes: 1
Reputation: 5240
Your result is two data frames (as you indicated):
iris[[1]]
is the first dataframeiris[[2]]
is the second dataframe.You can use merge
to merge both data frames into one, like this:
comp.iris <- lapply(inputVector, func)
comp.iris <- merge(comp.iris[[1]], comp.iris[[2]], sort = FALSE)
Hope it helps.
Upvotes: 0
Reputation: 1254
Playing around just a little bit more I did find the an answer to this question. Instead of having func
output the data frame I changed it to output just the vector and then used do.call
and lapply
func <- function(x) {
a <- paste0("Sepal", x)
b <- paste0("Petal", x)
x <- iris %>%
transmute(
!!(paste0("Compare.", x)) :=
case_when(
a > b ~ "Sepal",
a < b ~ "Petal",
TRUE ~ "Equal"
)
)
return(x)
}
do.call(cbind, c(iris, lapply(inputVector, func)))
I am definitely open to alternative solutions, as I imagine that this is perhaps not the best.
Upvotes: 0
Reputation: 9081
Slight change in the way you want to structure the output -
func <- function(x) {
a <- paste0("Sepal", x)
b <- paste0("Petal", x)
x1 <- iris %>%
mutate(
!!(paste0("Compare.", x)) :=
case_when(
a > b ~ "Sepal",
a < b ~ "Petal",
TRUE ~ "Equal"
)
)
return(x1[[paste0('Compare.',x)]])
}
inputVector <- c("Length", "Width")
op <- iris
op[,paste0('Compare.',inputVector)] <- lapply(inputVector, func)
Output
> head(op)
Sepal.Length Sepal.Width Petal.Length Petal.Width Species Compare.Length Compare.Width
1 5.1 3.5 1.4 0.2 setosa Sepal Sepal
2 4.9 3.0 1.4 0.2 setosa Sepal Sepal
3 4.7 3.2 1.3 0.2 setosa Sepal Sepal
4 4.6 3.1 1.5 0.2 setosa Sepal Sepal
5 5.0 3.6 1.4 0.2 setosa Sepal Sepal
6 5.4 3.9 1.7 0.4 setosa Sepal Sepal
Upvotes: 1