Indrajeet Patil
Indrajeet Patil

Reputation: 4889

How to create paired formula for t-test using {rlang}?

I'm trying to create a unified interface for running a t-test using {rlang}, and I'd like to use the formula method for t.test(). While doing so, I can construct a formula using {rlang} for unpaired t-test, but can't figure out how to create the formula for paired test.

library(rlang)

two_sample_t_test <- function(data, x, y, paired, ...) {
  x <- ensym(x)
  y <- ensym(y)

  exec(
    t.test,
    formula = if (paired) {
      new_formula(quote(Pair(x, y)), 1)
    } else {
      new_formula(y, x)
    },
    data = data,
    ...
  )
}

# unpaired t-test -------------------------

two_sample_t_test(mtcars, am, wt, paired = FALSE, var.equal = TRUE)
#> 
#>  Two Sample t-test
#> 
#> data:  wt by am
#> t = 5.2576, df = 30, p-value = 1.125e-05
#> alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
#> 95 percent confidence interval:
#>  0.8304317 1.8853577
#> sample estimates:
#> mean in group 0 mean in group 1 
#>        3.768895        2.411000

# paired t-test -------------------------

two_sample_t_test(mtcars, am, wt, paired = TRUE)

#> Error in model.frame.default(formula = .Primitive("quote")(Pair(x, y) ~ : invalid type (list) for variable 'Pair(x, y)'

Created on 2023-09-11 with reprex v2.0.2

Upvotes: 2

Views: 205

Answers (1)

TimTeaFan
TimTeaFan

Reputation: 18581

In {rlang} we can use expr() to capture the expression inside new_formula and !! to evaluate x and y early, which should give us the desired result (the output is much nicer than my old answer using eval(bquote(...)), see below):

library(rlang)

two_sample_t_test <- function(data, x, y, paired, ...) {

  x <- ensym(x)
  y <- ensym(y)
  
  exec(
    t.test,
    formula = if (paired) {
      new_formula(expr(Pair(!! x, !! y)), 1)
    } else {
      new_formula(y, x)
    },
    data = data,
    ...
  )
}

two_sample_t_test(mtcars, am, wt, paired = TRUE)
#> 
#>  Paired t-test
#> 
#> data:  Pair(am, wt)
#> t = -11.589, df = 31, p-value = 8.457e-13
#> alternative hypothesis: true mean difference is not equal to 0
#> 95 percent confidence interval:
#>  -3.305685 -2.316315
#> sample estimates:
#> mean difference 
#>          -2.811

two_sample_t_test(mtcars, am, wt, paired = FALSE, var.equal = TRUE)
#> 
#>  Two Sample t-test
#> 
#> data:  wt by am
#> t = 5.2576, df = 30, p-value = 1.125e-05
#> alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
#> 95 percent confidence interval:
#>  0.8304317 1.8853577
#> sample estimates:
#> mean in group 0 mean in group 1 
#>        3.768895        2.411000

Old answer

I'd try to use eval(bquote(...)) instead of quote() alone, then evaluate x and y early by wrapping them in .() and use the data.frame supplied in the data arg as envir for eval.

library(rlang)

two_sample_t_test <- function(data, x, y, paired, ...) {

  x <- ensym(x)
  y <- ensym(y)
  
  exec(
    t.test,
    formula = if (paired) {
      new_formula(eval(bquote(Pair(.(x), .(y))), envir = data), 1)
    } else {
      new_formula(y, x)
    },
    data = data,
    ...
  )
}

two_sample_t_test(mtcars, am, wt, paired = FALSE, var.equal = TRUE)
#> 
#>  Two Sample t-test
#> 
#> data:  wt by am
#> t = 5.2576, df = 30, p-value = 1.125e-05
#> alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
#> 95 percent confidence interval:
#>  0.8304317 1.8853577
#> sample estimates:
#> mean in group 0 mean in group 1 
#>        3.768895        2.411000

two_sample_t_test(mtcars, am, wt, paired = TRUE)
#> 
#>  Paired t-test
#> 
#> data:  structure(c(1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 2.62, 2.875, 2.32, 3.215, 3.44, 3.46, 3.57, 3.19, 3.15, 3.44, 3.44, 4.07, 3.73, 3.78, 5.25, 5.424, 5.345, 2.2, 1.615, 1.835, 2.465, 3.52, 3.435, 3.84, 3.845, 1.935, 2.14, 1.513, 3.17, 2.77, 3.57, 2.78), dim = c(32L, 2L), dimnames = list(NULL, c("x", "y")), class = "Pair")
#> t = -11.589, df = 31, p-value = 8.457e-13
#> alternative hypothesis: true mean difference is not equal to 0
#> 95 percent confidence interval:
#>  -3.305685 -2.316315
#> sample estimates:
#> mean difference 
#>          -2.811

Created on 2023-09-11 with reprex v2.0.2

Upvotes: 1

Related Questions