user1357079
user1357079

Reputation: 89

Output t.test results to a data frame in R

I have a data frame of values from individuals linked to groups. I want to identify those groups who have mean values greater than the mean value plus one standard deviation for the whole data set. To do this, I'm calculating the mean value and standard deviation for the entire data frame and then running pairwise t-tests to compare to each group mean. I'm running into trouble outputting the results.

> head(df)

  individual  group  value
1 11559638    75     0.371
2 11559641    75     0.367
3 11559648    75     0.410
4 11559650    75     0.417
5 11559652    75     0.440
6 11559654    75     0.395

> allvalues <- data.frame(mean=rep(mean(df$value), length(df$individual)), sd=rep(sd(df$value),  length(df$individual)))

> valueplus <- with(df, by(df, df$individual, function(x) t.test(allvalues$mean + allvalues$sd, df$value, data=x)))

> tmpplus

-------------------------------------------------------------------------- 
df$individuals: 10
NULL
-------------------------------------------------------------------------- 
df$individuals: 20
NULL
-------------------------------------------------------------------------- 
df$individuals: 21

    Welch Two Sample t-test

data:  allvalues$mean + allvalues$sd and df$value 
t = 84.5217, df = 4999, p-value < 2.2e-16
alternative hypothesis: true difference in means is not equal to 0 
95 percent confidence interval:
 0.04676957 0.04899068 
sample estimates:
mean of x mean of y 
0.4719964 0.4241162 

How do I get the results into a data frame? I'd expect the output to look something like this:

      groups  t        df     p-value  mean.x    mean.y
    1 10      NULL     NULL   NULL     NULL      NULL
    2 20      NULL     NULL   NULL     NULL      NULL 
    3 21      84.5217  4999   2.2e-16  0.4719964 0.4241162

Upvotes: 5

Views: 11676

Answers (1)

coffeinjunky
coffeinjunky

Reputation: 11514

From a purely programming perspective, you are asking how to get the output of t.test into a data.frame. Try the following, using mtcars:

library(broom)
tidy(t.test(mtcars$mpg))
  estimate statistic      p.value parameter conf.low conf.high
1 20.09062  18.85693 1.526151e-18        31 17.91768  22.26357

Or for multiple groups:

library(dplyr)
mtcars %>% group_by(vs) %>% do(tidy(t.test(.$mpg)))
# A tibble: 2 x 9
# Groups:   vs [2]
     vs estimate statistic  p.value parameter conf.low conf.high method            alternative
  <dbl>    <dbl>     <dbl>    <dbl>     <dbl>    <dbl>     <dbl> <chr>             <chr>      
1     0     16.6      18.3 1.32e-12        17     14.7      18.5 One Sample t-test two.sided  
2     1     24.6      17.1 2.75e-10        13     21.5      27.7 One Sample t-test two.sided  

Needless to say, you'll need to adjust the code to fit your specific setting.

Upvotes: 10

Related Questions