Richard Summers

Reputation: 143

Function in R, passing in variables

I am looking to run multiple ANOVAs in R, so I was hoping to write a function.

df = iris

run_anova <- function(var1,var2,df) {
  fit = aov(var1 ~ var2, df)
  return(fit)
}

In the iris dataset, the column names are "Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width" and "Species".

Assuming that I want to use these columns in the formula, how do I pass them into the run_anova function? I have tried passing them in as strings:

run_anova("Sepal.Width", "Petal.Length", df)

That doesn't work; R reports a message beginning: "In storage.mode(v) <- "double" :"

run_anova(Sepal.Width, Petal.Length, df)

When I pass them in without the quotes, R reports that the object is "not found". How can I pass the names of the df columns into the function?

Many thanks in advance for your help.

Upvotes: 0

Views: 433

Answers (2)

Maurits Evers

Reputation: 50738

An alternative is to use rlang's quasi-quotation syntax:

df = iris

library(rlang)
run_anova <- function(var1, var2, df) {
    var1 <- parse_expr(quo_name(enquo(var1)))
    var2 <- parse_expr(quo_name(enquo(var2)))
    eval_tidy(expr(aov(!!var1 ~ !!var2, data = df)))
}

This allows you to use both strings and unquoted expressions for var1 and var2:

run_anova("Sepal.Width", "Petal.Length", df)
run_anova(Sepal.Width, Petal.Length, df)

Both expressions return the same result.
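
As a possible simplification (a sketch, not part of the code above), rlang's ensym() captures an argument as a symbol whether it was supplied as a bare name or as a string, so the parse_expr(quo_name(enquo(...))) chain can be replaced. The function name run_anova_sym is made up for this illustration, and the rlang package is assumed to be installed.

```r
library(rlang)

# Hypothetical variant of run_anova: ensym() accepts a bare name or a string
# and returns a symbol in both cases.
run_anova_sym <- function(var1, var2, df) {
  var1 <- ensym(var1)
  var2 <- ensym(var2)
  # Splice the two symbols into the formula, then evaluate the aov() call
  eval_tidy(expr(aov(!!var1 ~ !!var2, data = df)))
}

fit_chr <- run_anova_sym("Sepal.Width", "Petal.Length", iris)
fit_sym <- run_anova_sym(Sepal.Width, Petal.Length, iris)

# Compare the fitted coefficients of the two calls
all.equal(coef(fit_chr), coef(fit_sym))
```

Unlike the version above, ensym() raises an error if it is given anything other than a single name or string, which may or may not be what you want.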

Upvotes: 0

G. Grothendieck

Reputation: 270298

1) Use reformulate to create the formula. The do.call is needed so that the Call: line in the output shows the actual formula and data name; if you don't care about that, you can use the shorter version shown in (3).

run_anova <- function(var1, var2, df) {
  fo <- reformulate(var2, var1)
  do.call("aov", list(fo, substitute(df)))
}

run_anova("Sepal.Width", "Petal.Length", iris)

giving

Call:
   aov(formula = Sepal.Width ~ Petal.Length, data = iris)

Terms:
                Petal.Length Residuals
Sum of Squares      5.196047 23.110887
Deg. of Freedom            1       148

Residual standard error: 0.3951641
Estimated effects may be unbalanced

2) Although the use of eval is discouraged, an alternative which also gives nice output is:

run_anova2 <- function(var1, var2, df) {
  fo <- reformulate(var2, var1)
  eval.parent(substitute(aov(fo, df)))
}

run_anova2("Sepal.Width", "Petal.Length", iris)
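
A quick way to check that (1) and (2) behave alike is to compare the fitted coefficients and look at the data argument recorded in the stored call. This is only a sketch; it repeats both definitions so that it runs on its own, and the object names fit1 and fit2 are arbitrary.

```r
# Version (1): build the call with do.call so the data name is preserved
run_anova <- function(var1, var2, df) {
  fo <- reformulate(var2, var1)
  do.call("aov", list(fo, substitute(df)))
}

# Version (2): substitute the formula and data into the call, evaluate in the caller
run_anova2 <- function(var1, var2, df) {
  fo <- reformulate(var2, var1)
  eval.parent(substitute(aov(fo, df)))
}

fit1 <- run_anova("Sepal.Width", "Petal.Length", iris)
fit2 <- run_anova2("Sepal.Width", "Petal.Length", iris)

# Same model from both versions
all.equal(coef(fit1), coef(fit2))

# The data argument recorded in each stored call
deparse(fit1$call$data)
deparse(fit2$call$data)
```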

3) If you don't care about the Call line in the output being nice then this simpler code can be used:

run_anova3 <- function(var1, var2, df) {
  fo <- reformulate(var2, var1)
  aov(fo, df)
}

run_anova3("Sepal.Width", "Petal.Length", iris)

giving:

Call:
   aov(formula = fo, data = df)
...etc...
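
Since the question mentions running multiple ANOVAs, here is a minimal sketch of how a function like the one in (3) could be applied over several columns with lapply. The choice of responses (every numeric column of iris) and of Species as the predictor is purely illustrative, and the names responses and fits are arbitrary.

```r
# Same helper as in (3)
run_anova3 <- function(var1, var2, df) {
  fo <- reformulate(var2, var1)
  aov(fo, df)
}

# Illustrative choice: each numeric column as the response, Species as predictor
responses <- names(iris)[sapply(iris, is.numeric)]

# var1 receives each response name in turn; var2 and df are fixed
fits <- lapply(responses, run_anova3, var2 = "Species", df = iris)
names(fits) <- responses

# One ANOVA table per response
lapply(fits, summary)
```

Because the column names are passed as strings, no non-standard evaluation is needed here; the same loop should also work with the versions in (1) and (2).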

Upvotes: 1
