Justin Leinaweaver

Reputation: 13

Incorporating split in new dplyr function

I'm attempting to write a function in R using dplyr that takes a data set, splits it by a factor, and then runs a series of other, more complicated, user-defined functions on those subsets.

My problem is that I'm not sure how to specify the argument in the function call so that split() recognizes and correctly interprets the input.

Toy data and simplified functions below. I'd like to be able to run the function once on grp1 and once on grp2.

Many thanks for any thoughts/assistance!

library(tidyverse)

# Create toy data
res <- tibble(
  x = runif(n = 25, 1, 100),
  g1 = sample(x = 1:3, size = 25, replace = TRUE),
  g2 = sample(x = 1:3, size = 25, replace = TRUE)
)

# Apply function after splitting by grouping variable 1
res %>%
  split(.$g1) %>%
  map_df(~ mean(.$x))

# Write function to allow different grouping variables (tried to follow the programming advice re dplyr functions even though I know split is a base function)
new_func1 <- function(data_in, grp) {

  grp <- enquo(grp)

  data_in %>%
    split(!!grp) %>%
    map_df(~ mean(x))
}

# All result in errors
new_func1(data_in = res, grp = g1)
new_func1(data_in = res, grp = ".$g1")
new_func1(data_in = res, grp = quote(.$g1))

# Try using quote
new_func2 <- function(data_in, grp) {

  data_in %>%
    split(grp) %>%
    map_df(~ mean(x))
}

# All result in errors
new_func2(data_in = res, grp = g1)
new_func2(data_in = res, grp = ".$g1")
new_func2(data_in = res, grp = quote(.$g1))

Upvotes: 1

Views: 364

Answers (1)

yutannihilation

Reputation: 808

First, you cannot omit the . inside map_df(). Within a purrr lambda, x on its own is not looked up in the data, so map_df(~ mean(.$x)) is the correct form.

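Second, split() is a base function, so you cannot use !! in it. !! only unquotes inside functions that support tidy evaluation; anywhere else R treats it as two ordinary negations. So, you can either

  1. unquote it inside a function that does understand it, such as pull().
  2. convert it to a string and index the data with it.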

For example:

new_func3 <- function(data_in, grp) {
  grp <- rlang::enquo(grp)

  data_in %>%
    split(pull(., !!grp)) %>%
    map_df(~ mean(.$x))
}

new_func4 <- function(data_in, grp) {
  grp <- rlang::enquo(grp)
  grp_chr <- rlang::quo_text(grp)

  data_in %>%
    split(.[[grp_chr]]) %>%
    map_df(~ mean(.$x))
}
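
To see the two options in isolation, here is a minimal sketch outside of any function. The names df, by_pull and by_name are made up for illustration, quo(g1) stands in for what enquo(grp) would capture, and rlang::as_name() is swapped in for quo_text() as an assumption on my part (both turn a captured bare column name into a string).

```r
library(dplyr)
library(rlang)

# Small illustrative data frame (hypothetical, not the toy data from the question)
df <- tibble(x = c(10, 20, 30, 40, 50, 60), g1 = rep(1:3, times = 2))

# Stand-in for what enquo(grp) captures when the caller writes grp = g1
grp <- quo(g1)

# In base R, `!!` is simply two logical negations applied in a row
base_result <- !!c(TRUE, FALSE)

# Option 1: pull() supports tidy evaluation, so `!!grp` is unquoted there
by_pull <- pull(df, !!grp)

# Option 2: convert the captured name to a string and index with [[ ]]
grp_chr <- as_name(grp)
by_name <- df[[grp_chr]]

# Both routes yield the same grouping vector, which can then be handed to split()
identical(by_pull, by_name)
```

Either vector can be passed as the second argument of split(), which is what new_func3 and new_func4 do.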

Or, if you just want to pass grp as a character string, this is enough:

new_func5 <- function(data_in, grp_chr) {
  data_in %>%
    split(.[[grp_chr]]) %>%
    map_df(~ mean(.$x))
}
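
As a usage sketch, the character version can be called once per grouping column and checked against base tapply(). The seed value and the names by_g1, by_g2 and check_g1 are assumptions added here for reproducibility and illustration; new_func5 is repeated only so the snippet runs on its own.

```r
library(dplyr)
library(purrr)

# Same definition as above, repeated so this snippet is self-contained
new_func5 <- function(data_in, grp_chr) {
  data_in %>%
    split(.[[grp_chr]]) %>%
    map_df(~ mean(.$x))
}

# Toy data as in the question, with an arbitrary seed added for reproducibility
set.seed(42)
res <- tibble(
  x  = runif(n = 25, 1, 100),
  g1 = sample(x = 1:3, size = 25, replace = TRUE),
  g2 = sample(x = 1:3, size = 25, replace = TRUE)
)

# Run once per grouping column by passing the column name as a string
by_g1 <- new_func5(res, "g1")
by_g2 <- new_func5(res, "g2")

# Cross-check the g1 result against base R's group means
check_g1 <- tapply(res$x, res$g1, mean)
all.equal(unname(unlist(by_g1)), as.vector(check_g1))
```

Each call returns a one-row tibble with one column per group level present in the chosen column.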

Upvotes: 3
