B_slash_

Reputation: 383

How to run numerous t.test calls within my data frame (multiple groups/outcomes, multiple quantitative variables)?

By running multiple t-tests in R (several groups compared across several quantitative variables), I would like to get something like this:

[image: Excel sheet]

I tried and modified the solution provided here: dplyr summarise multiple columns using t.test

library(dplyr)
library(tidyr)
data(mtcars)


vars_to_test <- c("disp","hp","drat","wt","qsec")
iv <- c("vs", "am")


mtcars %>%
  summarise_each_(
    funs_( 
      sprintf("stats::t.test(.[%s == 0], .[%s == 1])$p.value",iv,iv)
    ), 
    vars = vars_to_test)

Here is the output:

     disp_$..1      hp_$..1  drat_$..1      wt_$..1    qsec_$..1    disp_$..2   hp_$..2    drat_$..2     wt_$..2 qsec_$..2
1 2.476526e-06 1.819806e-06 0.01285342 0.0007281397 3.522404e-06 0.0002300413 0.2209796 5.266742e-06 6.27202e-06 0.2093498

I am facing multiple issues:

Please use summarise_if(), summarise_at(), or summarise_all() instead: 

  - To map `funs` over all variables, use summarise_all()
  - To map `funs` over a selection of variables, use summarise_at()
This warning is displayed once per session. 
2: funs_() is deprecated. 
Please use list() instead

Thanks a lot for your help!

Upvotes: 3

Views: 220

Answers (2)

A. Suliman

Reputation: 13125

Apply summarise_at for each variable in iv using map_dfr:

vars_to_test <- c("disp","hp","drat","wt","qsec")
iv <- c("vs", "am")
# map_dfr() uses these names as the .id values
names(iv) <- iv

library(dplyr)
library(purrr)
library(rlang)  # provides parse_expr()
map_dfr(iv, function(x) mtcars %>%
  summarise_at(vars_to_test,
    list(
      # parse_expr() turns the string built by sprintf() into an expression,
      # and !! splices that expression into the formula
      ~ !!(parse_expr(sprintf("stats::t.test(.[%s == 0], .[%s == 1])$p.value", x, x)))
    )), .id = "group")

  group         disp           hp         drat           wt         qsec
1    vs 2.476526e-06 1.819806e-06 1.285342e-02 7.281397e-04 3.522404e-06
2    am 2.300413e-04 2.209796e-01 5.266742e-06 6.272020e-06 2.093498e-01

Both summarise_each_ and funs_ are deprecated.
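
If you would rather avoid building code from strings, here is a minimal sketch of the same idea using summarise() with across(). It assumes dplyr 1.0.0 or later; the names grp and p_by_group are my own and only for illustration. The grouping column is pulled out of mtcars first, so the lambda can index each tested column with it directly.

```r
library(dplyr)

vars_to_test <- c("disp", "hp", "drat", "wt", "qsec")
iv <- c("vs", "am")
names(iv) <- iv  # the names become the values of the "group" column

# One summarise() per grouping variable, stacked with bind_rows().
p_by_group <- bind_rows(
  lapply(iv, function(g) {
    grp <- mtcars[[g]]  # 0/1 grouping vector, same length as each column
    mtcars %>%
      summarise(across(all_of(vars_to_test),
                       ~ t.test(.x[grp == 0], .x[grp == 1])$p.value))
  }),
  .id = "group"
)

p_by_group
```

This only works as written because the data frame is not grouped, so .x is the full column and lines up with grp.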

Upvotes: 3

AntoniosK

Reputation: 16121

library(tidyverse)

vars_to_test <- c("disp","hp","drat","wt","qsec")
iv <- c("vs", "am")

expand.grid(vars_to_test, iv, stringsAsFactors = FALSE) %>%          # create pairs of variables
  rowwise() %>%                                                      # for each pair
  mutate(p_val = t.test(mtcars[,Var1] ~ mtcars[,Var2])$p.value) %>%  # get p value from t.test
  spread(Var1, p_val)                                                # reshape output

# # A tibble: 2 x 6
#   Var2        disp       drat         hp       qsec         wt
#   <chr>      <dbl>      <dbl>      <dbl>      <dbl>      <dbl>
# 1 am    0.000230   0.00000527 0.221      0.209      0.00000627
# 2 vs    0.00000248 0.0129     0.00000182 0.00000352 0.000728  
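
For comparison, the same pairwise idea can be sketched in base R with nested sapply() calls, assuming you are happy with a plain matrix instead of a tibble. The name p_values is only illustrative.

```r
vars_to_test <- c("disp", "hp", "drat", "wt", "qsec")
iv <- c("vs", "am")

# Outer loop over tested variables, inner loop over grouping variables.
# sapply() names the results after the character vectors it iterates over,
# so the result is a matrix with one row per grouping variable and
# one column per tested variable.
p_values <- sapply(vars_to_test, function(v) {
  sapply(iv, function(g) t.test(mtcars[[v]] ~ mtcars[[g]])$p.value)
})

p_values
```

Use as.data.frame(p_values) if you need a data frame, for example to export the table.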

Upvotes: 4
