joe

Reputation: 331

How can I pass a column name as a function argument using dplyr and ggplot2?

I am trying to write a function that will spit out model diagnostic plots.

to_plot <- function(df, model, response_variable, indep_variable) {
  resp_plot <- 
    df %>%
    mutate(model_resp = predict.glm(model, df, type = 'response')) %>%
    group_by(indep_variable) %>%
    summarize(actual_response = mean(response_variable),
              predicted_response = mean(model_resp)) %>%
    ggplot(aes(indep_variable)) + 
    geom_line(aes(x = indep_variable, y = actual_response, colour = "actual")) + 
    geom_line(aes(x = indep_variable, y = predicted_response, colour = "predicted")) +
    ylab(label = 'Response')

}

When I run this over a dataset, dplyr throws an error that I don't understand:

fit <- glm(data = mtcars, mpg ~ wt + qsec + am, family = gaussian(link = 'identity'))
to_plot(mtcars, fit, mpg, wt)

 Error in grouped_df_impl(data, unname(vars), drop) : 
  Column `indep_variable` is unknown 

Based on some crude debugging, I found that the error happens in the group_by step, so it could be related to how I'm calling the columns in the function. Thanks!

Upvotes: 0

Views: 832

Answers (1)

joe

Reputation: 331

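This code seems to fix it. As the commenters mention, column names passed to the function as bare names must be captured with enquo() and then unquoted with !! inside the dplyr verbs. Note that aes() becomes aes_() so that the captured column can be passed in as a quoted expression; columns created inside the pipeline are wrapped in quote(). (aes_() takes quoted expressions, not strings; aes_string() is the variant for strings.)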

library(tidyverse)

to_plot <- function(df, model, response_variable, indep_variable) {
  # Capture the bare column names as quosures so they can be unquoted with !! below
  response_variable <- enquo(response_variable)
  indep_variable <- enquo(indep_variable)

  resp_plot <- 
    df %>%
    mutate(model_resp = predict.glm(model, df, type = 'response')) %>%
    group_by(!!indep_variable) %>%
    summarize(actual_response = mean(!!response_variable),
              predicted_response = mean(model_resp)) %>%
    ggplot(aes_(indep_variable)) + 
    geom_line(aes_(x = indep_variable, y = quote(actual_response)), colour = "blue") + 
    geom_line(aes_(x = indep_variable, y = quote(predicted_response)), colour = "red") +
    ylab(label = 'Response')

  return(resp_plot)
}

fit <- glm(data = mtcars, mpg ~ wt + qsec + am, family = gaussian(link = 'identity'))
to_plot(mtcars, fit, mpg, wt)
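
For what it's worth, the same idea can probably be written more compactly on newer package versions. The sketch below assumes rlang >= 0.4.0 (for the {{ }} "embrace" operator, which combines enquo() and !! in one step) and ggplot2 >= 3.0.0 (where aes() itself understands tidy evaluation, so aes_() is not needed). The name to_plot_embrace is made up for illustration; it also maps colour inside aes() so the actual/predicted legend from the question comes back.

```r
library(dplyr)
library(ggplot2)

# Hypothetical variant of to_plot(); assumes rlang >= 0.4.0 and ggplot2 >= 3.0.0.
to_plot_embrace <- function(df, model, response_variable, indep_variable) {
  df %>%
    # Add the model's fitted response for every row
    mutate(model_resp = predict(model, newdata = df, type = "response")) %>%
    # {{ }} forwards the bare column name supplied by the caller
    group_by({{ indep_variable }}) %>%
    summarize(actual_response = mean({{ response_variable }}),
              predicted_response = mean(model_resp)) %>%
    # aes() accepts {{ }} directly, so no aes_() or quote() is required
    ggplot(aes(x = {{ indep_variable }})) +
    geom_line(aes(y = actual_response, colour = "actual")) +
    geom_line(aes(y = predicted_response, colour = "predicted")) +
    ylab("Response")
}

fit <- glm(mpg ~ wt + qsec + am, data = mtcars, family = gaussian(link = "identity"))
p <- to_plot_embrace(mtcars, fit, mpg, wt)
p
```

The call site is unchanged: columns are still passed as bare names (mpg, wt), not strings.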

Upvotes: 1
