Reputation: 89
I have a data frame of values from individuals linked to groups. I want to identify the groups whose mean value is greater than the overall mean plus one standard deviation for the whole data set. To do this, I'm calculating the mean and standard deviation for the entire data frame and then running t-tests to compare that threshold with each group mean. I'm running into trouble outputting the results.
> head(df)
  individual group value
1   11559638    75 0.371
2   11559641    75 0.367
3   11559648    75 0.410
4   11559650    75 0.417
5   11559652    75 0.440
6   11559654    75 0.395
> allvalues <- data.frame(mean=rep(mean(df$value), length(df$individual)), sd=rep(sd(df$value), length(df$individual)))
> valueplus <- with(df, by(df, df$individual, function(x) t.test(allvalues$mean + allvalues$sd, df$value, data=x)))
> valueplus
--------------------------------------------------------------------------
df$individuals: 10
NULL
--------------------------------------------------------------------------
df$individuals: 20
NULL
--------------------------------------------------------------------------
df$individuals: 21
	Welch Two Sample t-test

data:  allvalues$mean + allvalues$sd and df$value
t = 84.5217, df = 4999, p-value < 2.2e-16
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 0.04676957 0.04899068
sample estimates:
mean of x mean of y
0.4719964 0.4241162
How do I get the results into a data frame? I'd expect the output to look something like this:
  groups       t   df p-value    mean.x    mean.y
1     10    NULL NULL    NULL      NULL      NULL
2     20    NULL NULL    NULL      NULL      NULL
3     21 84.5217 4999 2.2e-16 0.4719964 0.4241162
Upvotes: 5
Views: 11676
Reputation: 11514
From a purely programming perspective, you are asking how to get the output of t.test into a data.frame. Try the following, using mtcars:
library(broom)
tidy(t.test(mtcars$mpg))
  estimate statistic      p.value parameter conf.low conf.high
1 20.09062  18.85693 1.526151e-18        31 17.91768  22.26357
Or for multiple groups:
library(dplyr)
mtcars %>% group_by(vs) %>% do(tidy(t.test(.$mpg)))
# A tibble: 2 x 9
# Groups: vs [2]
     vs estimate statistic  p.value parameter conf.low conf.high method            alternative
  <dbl>    <dbl>     <dbl>    <dbl>     <dbl>    <dbl>     <dbl> <chr>             <chr>
1     0     16.6      18.3 1.32e-12        17     14.7      18.5 One Sample t-test two.sided
2     1     24.6      17.1 2.75e-10        13     21.5      27.7 One Sample t-test two.sided
Needless to say, you'll need to adjust the code to fit your specific setting.
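As one possible adaptation to a data frame shaped like the one in the question, here is a minimal base R sketch that needs no extra packages. It swaps in a different test from the question's: each group is compared against the overall mean plus one standard deviation with a one-sample, one-sided t-test, treating that threshold as a fixed constant (an assumption that ignores the threshold's own sampling uncertainty). The data, the helper name test_group, and the choice to return NA for groups too small to test are all made up for illustration.

```r
# Hypothetical data with the same columns as the question's df (values are invented).
set.seed(1)
df <- data.frame(
  individual = 1:61,
  group = c(rep(10, 1), rep(20, 30), rep(21, 30)),
  value = c(0.40,
            rnorm(30, mean = 0.40, sd = 0.03),
            rnorm(30, mean = 0.50, sd = 0.03))
)

# Threshold: overall mean plus one overall standard deviation.
threshold <- mean(df$value) + sd(df$value)

# Run a one-sample, one-sided t-test on one group's values and return a one-row
# data frame. If t.test() fails (for example, fewer than two observations),
# the test columns are filled with NA instead of stopping the whole loop.
test_group <- function(x) {
  res <- tryCatch(
    t.test(x, mu = threshold, alternative = "greater"),
    error = function(e) NULL
  )
  if (is.null(res)) {
    return(data.frame(t = NA_real_, df = NA_real_, p.value = NA_real_,
                      mean = mean(x), n = length(x)))
  }
  data.frame(t = unname(res$statistic),
             df = unname(res$parameter),
             p.value = res$p.value,
             mean = unname(res$estimate),
             n = length(x))
}

# Split the values by group, test each group, and bind the rows together.
rows <- lapply(split(df$value, df$group), test_group)
results <- cbind(group = names(rows), do.call(rbind, rows))
rownames(results) <- NULL
results
```

Groups where the test cannot be computed show up as NA rows rather than NULL, which keeps the result a regular data frame that can be filtered, for example with results[which(results$p.value < 0.05), ].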
Upvotes: 10