nimliug

Reputation: 391

How to use a variable name in a formula instead of the column itself

I want to summarise data by group using the summary_by function from the doBy package. However, I can't write the column names directly in the summary_by formula; they are stored as strings in variables I created earlier.
Below is the result I would like to achieve:

library(data.table)
library(doBy)

mtcars = data.table(mtcars)

doBy::summary_by(data = mtcars, mpg ~ gear + am, FUN = "mean")

output:

gear  am   mpg."mean"
3     0    16.10667
4     0    21.05000
4     1    26.27500
5     1    21.38000

Here is what I want to do:

library(data.table)
library(doBy)

mtcars = data.table(mtcars)

variable1 = "gear" # which is a column name of mtcars
variable2 = "am" # which is a column name of mtcars
variable3 = "mpg" # which is a column name of mtcars

doBy::summary_by(data = mtcars, variable3 ~ variable1 + variable2 , FUN = "mean")

I tried the functions get, assign, eval and mget, but I couldn't find a solution.

Upvotes: 0

Views: 336

Answers (2)

nimliug

Reputation: 391

Thanks @mnist, it works!

I also found two other ways:

library(data.table)
library(doBy)

mtcars = data.table(mtcars)

variable1 = "gear" # which is a column name of mtcars
variable2 = "am" # which is a column name of mtcars
variable3 = "mpg" # which is a column name of mtcars
  • summary_by solution with the reformulate function:

    summary_by(data = mtcars, reformulate(
        termlabels = c(variable1, variable2),
        response = variable3)
    )
    
  • Native data.table way:

    mtcars[, mean(get(variable3)), by = mget(c(variable1, variable2))]
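
To see what reformulate builds, here is a minimal base R sketch. It uses aggregate() as a stand-in for summary_by so that it runs without doBy, and the names group_vars and response_var (plus the extra grouping column "cyl") are assumptions added for illustration, not part of the original question.

```r
# Hypothetical inputs: any number of grouping columns, given as strings
group_vars <- c("gear", "am", "cyl")
response_var <- "mpg"

# reformulate() assembles response ~ term1 + term2 + ... from character vectors
f <- reformulate(termlabels = group_vars, response = response_var)
print(f)

# Base R stand-in for summary_by(): mean of mpg for each observed group
res <- aggregate(f, data = mtcars, FUN = mean)
print(res)
```

Because termlabels is a plain character vector, the same call works unchanged for one grouping column or several.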
    

Upvotes: 1

mnist

Reputation: 6956

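A literal formula relies on non-standard evaluation: its terms are not evaluated, so summary_by looks for columns literally named variable1, variable2 and variable3. Build the formula from a string instead and convert it with as.formula().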

library(data.table)
library(doBy)

mtcars = data.table(mtcars)

variable1 = "gear" # which is a column name of mtcars
variable2 = "am" # which is a column name of mtcars
variable3 = "mpg" # which is a column name of mtcars

doBy::summary_by(data = mtcars, 
                 # as an alternative to sprintf(), use paste() or glue()
                 as.formula(sprintf("%s ~ %s + %s", variable3, variable1, variable2)), 
                 FUN = "mean")
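
As a quick check that the string really becomes the intended formula, the sketch below builds it with both sprintf() and paste() and compares the results. It uses base R's aggregate() as a stand-in for summary_by so it runs without doBy; the names f_sprintf, f_paste and res are illustrative only.

```r
# Column names held in variables, as in the question
variable1 <- "gear"
variable2 <- "am"
variable3 <- "mpg"

# Two ways to build the same formula, mpg ~ gear + am, from strings
f_sprintf <- as.formula(sprintf("%s ~ %s + %s", variable3, variable1, variable2))
f_paste <- as.formula(
  paste(variable3, "~", paste(c(variable1, variable2), collapse = " + "))
)

# Both deparse to the same text
print(deparse(f_sprintf))
print(deparse(f_paste))

# Base R stand-in for summary_by(): group means of mpg by gear and am
res <- aggregate(f_sprintf, data = mtcars, FUN = mean)
print(res)
```

The paste() variant with collapse = " + " scales to any number of grouping variables, whereas the sprintf() template has to match the number of terms.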

Upvotes: 2
