CostelloatKaro

Reputation: 33

Make a pipeable function

I'm stuck on a small issue with writing a custom function in R that works with magrittr pipes. I've been trying to learn how to write functions that work when data is piped into them. The function below summarizes the original dataset correctly, but it ignores modifications made by an earlier step in the pipe. Example below:

library(dplyr) # provides %>%, filter() and summarise()

TestData <- runif(1000, 1, 100)
TestID <- 1:1000
data01 <- data.frame(TestID, TestData) # Generate data to test the command on

custom_summary_cont2 <- function(DAT, var) {
  
  DAT %>%
   summarise(
     mean = mean(var),
     median = median(var),
     sd = sd(var),
     quant25 = unname(quantile(var, probs = 0.25)),
     quant75 = unname(quantile(var, probs = 0.75)),
     min = min(var),
     max = max(var)
    )
} # The custom function

Now running this code either as:

summary <- custom_summary_cont2(data01, TestData)

or

summary <- data01 %>%
custom_summary_cont2(TestData)

Both produce the results I'm interested in. The complication occurs when I call the custom function after applying another function first. For example:

summary <- data01 %>%
filter(TestData >50) %>%
custom_summary_cont2(TestData)

This returns the same result as if the filter step were not there. How would I edit the function so that it uses the result of the filter command?

P.S. If this is a really stupid question, I'd love a recommendation for a good book that covers these topics.

Upvotes: 3

Views: 455

Answers (2)

Kra.P

Reputation: 15143

Use {{var}} instead of var inside summarise(), and the function will work on the filtered data.

custom_summary_cont2 <- function(DAT, var) {
  
  DAT %>%
    summarise(
      mean = mean({{var}}),
      median = median({{var}}),
      sd = sd({{var}}),
      quant25 = unname(quantile({{var}}, probs = 0.25)),
      quant75 = unname(quantile({{var}}, probs = 0.75)),
      min = min({{var}}),
      max = max({{var}})
    )
}

data01 %>%
  filter(TestData >50) %>%
  custom_summary_cont2(TestData)

      mean   median       sd  quant25  quant75      min      max
1 74.95416 74.38507 14.66733 62.43108 87.26725 50.08382 99.92935

For comparison, the same summary written out directly gives an identical result:

data01 %>%
  filter(TestData >50) %>%
  summarise(
    mean = mean(TestData),
    median = median(TestData),
    sd = sd(TestData),
    quant25 = unname(quantile(TestData, probs = 0.25)),
    quant75 = unname(quantile(TestData, probs = 0.75)),
    min = min(TestData),
    max = max(TestData)
  )
      mean   median       sd  quant25  quant75      min      max
1 74.95416 74.38507 14.66733 62.43108 87.26725 50.08382 99.92935
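
If you would rather pass the column name as a string, one possible alternative is the .data pronoun. This is only a sketch: the function name custom_summary_chr, its argument var_name, and the seed are made up for illustration, and it computes just three of the statistics to stay short.

```r
library(dplyr)

# Hypothetical variant: the column is given as a string, e.g. "TestData".
# .data[[var_name]] looks the column up in the data passed to summarise().
custom_summary_chr <- function(DAT, var_name) {
  DAT %>%
    summarise(
      mean = mean(.data[[var_name]]),
      min  = min(.data[[var_name]]),
      max  = max(.data[[var_name]])
    )
}

set.seed(1) # arbitrary seed, only so the run is repeatable
data01 <- data.frame(TestID = 1:1000, TestData = runif(1000, 1, 100))

res <- data01 %>%
  filter(TestData > 50) %>%
  custom_summary_chr("TestData")
```

Because the lookup goes through the data that arrives via the pipe, res$min should come out above 50 here, which shows that the filter step is respected.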

Upvotes: 2

Robin Gertenbach

Reputation: 10806

This is not related to piping.

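Inside summarise(), var is not a column of DAT, so it evaluates to the TestData vector from your global environment (all 1000 values), not to the TestData column of the filtered data.

To refer to the column, the argument has to be forwarded using non-standard (tidy) evaluation.

You can do this by wrapping the argument in {{ }} (the embrace operator), i.e.: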

custom_summary_cont2 <- function(DAT, var) {
  DAT %>%
    summarise(
      mean = mean({{var}}),
      median = median({{var}}),
      sd = sd({{var}}),
      quant25 = unname(quantile({{var}}, probs = 0.25)),
      quant75 = unname(quantile({{var}}, probs = 0.75)),
      min = min({{var}}),
      max = max({{var}})
    )
}
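
To see the difference in isolation, here is a minimal sketch. The names df, bad and good are made up for the illustration, and the values are chosen so that the global vector and the data frame column disagree.

```r
library(dplyr)

TestData <- c(1, 2, 3)                      # vector in the calling environment
df <- data.frame(TestData = c(10, 20, 30))  # column with the same name, different values

# Without {{ }}: `var` is not a column of DAT, so it resolves to the outer vector
bad <- function(DAT, var) {
  DAT %>% summarise(mean = mean(var))
}

# With {{ }}: the argument is forwarded and looked up in DAT first
good <- function(DAT, var) {
  DAT %>% summarise(mean = mean({{ var }}))
}

bad_mean  <- bad(df, TestData)$mean   # 2, the mean of the outer vector
good_mean <- good(df, TestData)$mean  # 20, the mean of the column
```

In the question, the outer vector and the unfiltered column happen to hold the same values, which is why the original function looked correct until filter() was added.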

Upvotes: 3
