cully_π

Reputation: 47

Extracting outputs from for loops with dplyr pipes into dataframe in R

I'm having trouble figuring out how to run a series of t-tests in a for loop and append the output of each test to a data frame. The goal is to run many t-tests at once and produce a single data frame of all the results.

Here it is done with the mtcars dataset the slow way:

library(tidyverse)
library(rstatix)


# T-test to determine if there is a significant difference between mpg of 
# automatic vs manual transmissions (automatic=0, manual=1)
t1 <- mtcars %>% 
  t_test(mpg ~ am) %>% 
  mutate(var = "am") # add label to merge by

# Calculate mean mpg of both groups
t1.1 <- mtcars %>% 
  group_by(am) %>% 
  summarize(Mean = mean(mpg, na.rm=TRUE)) %>% 
  pivot_wider(names_from = am, values_from = Mean) %>% # Bring to wide format to add to df
  mutate(var = "am") # add label to merge by

# T-test for vs (v-shape=0, straight line=1)
t2 <- mtcars %>% 
  t_test(mpg ~ vs) %>% 
  mutate(var = "vs") # add label to merge by
# Calculate mean mpg of both groups
t2.1 <- mtcars %>% 
  group_by(vs) %>% 
  summarize(Mean = mean(mpg, na.rm=TRUE)) %>% 
  pivot_wider(names_from = vs, values_from = Mean) %>% # Bring to wide format to add to df
  mutate(var = "vs") # add label to merge by

# Merge dfs and rename
t_bind <- rbind(t1, t2)
t.1_bind <- rbind(t1.1, t2.1)
t.1_bind <- t.1_bind %>% rename("mean_0" = "0", "mean_1" = "1")
t_merge <- merge(t_bind, t.1_bind, by = "var")

But when I try to set this up as a loop, I'm lost. It seems like this should be fairly simple; I'm just not thinking about it the right way:

t_vars <- c("am", "vs")  # etc.

for (i in t_vars) {
  x1 <- mtcars %>% 
    t_test(mpg ~ i) %>% 
    mutate(var = colnames(mpg[[i]]))
  df <- append(x1)
}

# Error: Can't extract columns that don't exist.
# x Column `i` doesn't exist.

Thank you for the help!!

Upvotes: 2

Views: 129

Answers (2)

langtang

Reputation: 24722

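Something like this? It builds each formula from the variable name as a string, then binds the per-variable results into one data frame: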

bind_rows(lapply(c("am", "vs"), function(i) {
  mtcars %>% 
    t_test(formula(paste0("mpg ~ ", i)), detailed = TRUE) %>%
    mutate(var = i)
}))

Output:

# A tibble: 2 × 16
  estimate estimate1 estimate2 .y.   group1 group2    n1    n2 statistic       p    df conf.low conf.high method alternative var  
     <dbl>     <dbl>     <dbl> <chr> <chr>  <chr>  <int> <int>     <dbl>   <dbl> <dbl>    <dbl>     <dbl> <chr>  <chr>       <chr>
1    -7.24      17.1      24.4 mpg   0      1         19    13     -3.77 0.00137  18.3    -11.3     -3.21 T-test two.sided   am   
2    -7.94      16.6      24.6 mpg   0      1         18    14     -4.67 0.00011  22.7    -11.5     -4.42 T-test two.sided   vs   
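
If an explicit for loop is preferred, the same idea can be sketched as below. The loop in the question fails because `mpg ~ i` is taken literally, so t_test looks for a column named i. This sketch builds the formula with base R's reformulate() instead and stores each result in a list that is bound once at the end. The names `results` and `t_all` are made up for illustration; treat this as one possible layout rather than the only way to do it.

```r
library(dplyr)
library(rstatix)

t_vars <- c("am", "vs")  # grouping variables to test, as in the question

# Pre-allocate a named list; one slot per variable (hypothetical name)
results <- vector("list", length(t_vars))
names(results) <- t_vars

for (v in t_vars) {
  # reformulate("am", response = "mpg") produces the formula mpg ~ am
  results[[v]] <- mtcars %>%
    t_test(reformulate(v, response = "mpg"), detailed = TRUE) %>%
    mutate(var = v)  # label each row with the variable that was tested
}

# Stack the per-variable results into a single data frame
t_all <- bind_rows(results)
t_all
```

Collecting into a list and binding once avoids growing a data frame inside the loop, and it should give the same rows as the lapply version above.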

Upvotes: 3

Ben

Reputation: 30474

Here is an alternative using tidyverse nest_by after putting data into long form:

library(tidyverse)
library(rstatix)

mtcars %>%
  pivot_longer(cols = c(am, vs)) %>%
  nest_by(name) %>%
  transmute(model = list(t_test(data = data, formula = mpg ~ value, detailed = TRUE))) %>%
  unnest(model)

Output:

  name  estimate estimate1 estimate2 .y.   group1 group2    n1    n2 statistic       p    df conf.low conf.high method alternative
  <chr>    <dbl>     <dbl>     <dbl> <chr> <chr>  <chr>  <int> <int>     <dbl>   <dbl> <dbl>    <dbl>     <dbl> <chr>  <chr>      
1 am       -7.24      17.1      24.4 mpg   0      1         19    13     -3.77 0.00137  18.3    -11.3     -3.21 T-test two.sided  
2 vs       -7.94      16.6      24.6 mpg   0      1         18    14     -4.67 0.00011  22.7    -11.5     -4.42 T-test two.sided
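
To get closer to the merged table from the question (one row per variable with mean_0 and mean_1), the detailed output may already be enough: in the output above, estimate1 and estimate2 are the means of group 0 and group 1. A minimal sketch, assuming those two columns are all that is needed from the separate summarize step; the name `t_summary` and the choice of columns kept are illustrative only.

```r
library(tidyverse)
library(rstatix)

t_summary <- mtcars %>%
  pivot_longer(cols = c(am, vs)) %>%
  nest_by(name) %>%
  transmute(model = list(t_test(data = data, formula = mpg ~ value, detailed = TRUE))) %>%
  unnest(model) %>%
  ungroup() %>%
  # Rename the group means to match the column names used in the question
  select(var = name, mean_0 = estimate1, mean_1 = estimate2, statistic, df, p)

t_summary
```

This skips the separate group_by/summarize/pivot_wider/merge steps, since the means come from the same t_test call.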

Upvotes: 1
