jamesguy0121
jamesguy0121

Reputation: 1254

Use an apply function with user defined function that adds variables to data frame

I have defined a function which will dynamically create new variables in a data frame. For this function the input is a string which is then pasted with other strings to create variable names that already exist in the data frame which are then compared using case_when within mutate. The output of the function is the data frame with the new variable appended to the end. I want to apply this function to a vector of inputs, and create multiple new columns in the data frame. I have used the iris data set to create a function very similar to what I am doing.

func <- function(x) {
  a <- paste0("Sepal.", x)
  b <- paste0("Petal.", x)
  x <- iris %>% 
    mutate(
      !!(paste0("Compare.", x)) :=
        case_when(
          iris[[a]] > iris[[b]] ~ "Sepal",
          iris[[a]] < iris[[b]] ~ "Petal",
          TRUE ~ "Equal"
      )
    )
  return(x)
}

inputVector <- c("Length", "Width")

I want to apply this function to inputVector and have one data frame returned which would have the two new variables in it, preferably without loops. I'm looking to use something like

iris <- lapply(inputVector, func)

But this returns a list of two data frames. I also know that I could use <<- in func, but I would like to avoid this.

Upvotes: 0

Views: 1208

Answers (4)

alistaire
alistaire

Reputation: 43344

A simpler option is to only generate the new columns in the function by substituting transmute for mutate, iterating and joining them into a data frame with map_dfc, and then using bind_cols to add them to the original:

library(tidyverse)

func <- function(x) {
  a <- sym(paste0("Sepal.", x))    # these need to be quosures to refer to variables
  b <- sym(paste0("Petal.", x))
  iris %>% transmute(
      !!paste0("Compare.", x) := case_when(
          !!a > !!b ~ "Sepal",    # unquote quosures
          !!a < !!b ~ "Petal",
          TRUE ~ "Equal"
      )
    )
}

inputVector <- c("Length", "Width")

iris %>% bind_cols(map_dfc(inputVector, func)) %>% head()
#>   Sepal.Length Sepal.Width Petal.Length Petal.Width Species Compare.Length
#> 1          5.1         3.5          1.4         0.2  setosa          Sepal
#> 2          4.9         3.0          1.4         0.2  setosa          Sepal
#> 3          4.7         3.2          1.3         0.2  setosa          Sepal
#> 4          4.6         3.1          1.5         0.2  setosa          Sepal
#> 5          5.0         3.6          1.4         0.2  setosa          Sepal
#> 6          5.4         3.9          1.7         0.4  setosa          Sepal
#>   Compare.Width
#> 1         Sepal
#> 2         Sepal
#> 3         Sepal
#> 4         Sepal
#> 5         Sepal
#> 6         Sepal

A slightly more elegant approach is to structure the function to take a vector, moving all the iteration inside. You could use the approach above, or just generate the vectors and assemble them into a data frame once there are multiple:

func2 <- function(x) {
  columns <- map_dfc(x, function(y){
    a <- paste0("Sepal.", y)
    b <- paste0("Petal.", y)
    column <- list(case_when(
      iris[[a]] > iris[[b]] ~ "Sepal",    # base notation is simpler than quosures
      iris[[a]] < iris[[b]] ~ "Petal",
      TRUE ~ "Equal"
    ))
    names(column) <- paste0("Compare.", y)
    column
  })
  iris %>% bind_cols(columns)
}

func2(inputVector) %>% tail()
#>     Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
#> 145          6.7         3.3          5.7         2.5 virginica
#> 146          6.7         3.0          5.2         2.3 virginica
#> 147          6.3         2.5          5.0         1.9 virginica
#> 148          6.5         3.0          5.2         2.0 virginica
#> 149          6.2         3.4          5.4         2.3 virginica
#> 150          5.9         3.0          5.1         1.8 virginica
#>     Compare.Length Compare.Width
#> 145          Sepal         Sepal
#> 146          Sepal         Sepal
#> 147          Sepal         Sepal
#> 148          Sepal         Sepal
#> 149          Sepal         Sepal
#> 150          Sepal         Sepal

Upvotes: 1

Taher A. Ghaleb
Taher A. Ghaleb

Reputation: 5240

Your result is two data frames (as you indicated):

  • iris[[1]] is the first dataframe
  • iris[[2]] is the second dataframe.

You can use merge to merge both data frames into one, like this:

comp.iris <- lapply(inputVector, func) 
comp.iris <- merge(comp.iris[[1]], comp.iris[[2]], sort = FALSE)

Hope it helps.

Upvotes: 0

jamesguy0121
jamesguy0121

Reputation: 1254

Playing around just a little bit more I did find the an answer to this question. Instead of having func output the data frame I changed it to output just the vector and then used do.call and lapply

func <- function(x) {
  a <- paste0("Sepal", x)
  b <- paste0("Petal", x)
  x <- iris %>% 
    transmute(
      !!(paste0("Compare.", x)) :=
        case_when(
          a > b ~ "Sepal",
          a < b ~ "Petal",
          TRUE ~ "Equal"
      )
    )
  return(x)
}

do.call(cbind, c(iris, lapply(inputVector, func)))

I am definitely open to alternative solutions, as I imagine that this is perhaps not the best.

Upvotes: 0

Vivek Kalyanarangan
Vivek Kalyanarangan

Reputation: 9081

Slight change in the way you want to structure the output -

func <- function(x) {
  a <- paste0("Sepal", x)
  b <- paste0("Petal", x)
  x1 <- iris %>% 
    mutate(
      !!(paste0("Compare.", x)) :=
        case_when(
          a > b ~ "Sepal",
          a < b ~ "Petal",
          TRUE ~ "Equal"
        )
    )
  return(x1[[paste0('Compare.',x)]])
}

inputVector <- c("Length", "Width")
op <- iris
op[,paste0('Compare.',inputVector)] <- lapply(inputVector, func)

Output

> head(op)
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species Compare.Length Compare.Width
1          5.1         3.5          1.4         0.2  setosa          Sepal         Sepal
2          4.9         3.0          1.4         0.2  setosa          Sepal         Sepal
3          4.7         3.2          1.3         0.2  setosa          Sepal         Sepal
4          4.6         3.1          1.5         0.2  setosa          Sepal         Sepal
5          5.0         3.6          1.4         0.2  setosa          Sepal         Sepal
6          5.4         3.9          1.7         0.4  setosa          Sepal         Sepal

Upvotes: 1

Related Questions