I have a panel data set and want to create a matrix similar to a correlation matrix, but containing the differences between the group means estimated by the t-test, together with the t-statistics.
Using the ToothGrowth data, I first split each supp group into subgroups by dose, and I want to calculate the t-statistic for every possible pair of subgroups.
I want my t-test matrix to look as follows
         VC_all  VC_0.5  VC_1  VC_all  VC_0.5  VC_1  OJ_all        OJ_0.5  OJ_1
VC_all                                               -4 ( -1.92 )
VC_0.5
VC_1
VC_all
VC_0.5
VC_1
OJ_all
OJ_0.5
OJ_1
As an example, I filled in one value with the following code (df is my result matrix):
library(dplyr)  # provides filter()
t_test <- t.test(x = filter(ToothGrowth, supp == "VC")$len,
                 y = filter(ToothGrowth, supp == "OJ")$len, var.equal = TRUE)
df["VC_all", "OJ_all"] <- paste(round(t_test$estimate[1] - t_test$estimate[2]),
                                "(", round(t_test$statistic, 2), ")")
Is there a faster way to do this and calculate the t-statistics for every pair of groups?
Upvotes: 0
Views: 495
You can use this
# generate data
df <- data.frame(matrix(rnorm(100*3), ncol= 3))
# name data
names(df) <- c("a", "b", "c")
# or, to use your own data instead, uncomment the next line
# df <- name_of_your_dataframe
# make a dataframe for the results
results <- data.frame(matrix(rep(NA, ncol(df)*ncol(df)), ncol= ncol(df)))
# name the results dataframe
names(results) <- names(df)
rownames(results) <- names(df)
# between which columns do we need to run t-tests?
to_estimate <- t(combn(names(df), 2))
# fill the lower triangle: its column-major order matches the pair order of combn(),
# which is not true of the upper triangle once there are more than 3 columns
results[lower.tri(results)] <- apply(to_estimate, 1, function(to_estimate_i){
  t_results <- t.test(df[ , to_estimate_i[1]], df[ , to_estimate_i[2]])
  paste0(round(t_results$estimate[1] - t_results$estimate[2], 2), " (", round(t_results$statistic, 2), ")")
})
# copy lower to upper by transposing
results[upper.tri(results)] <- t(results)[upper.tri(results)]
All you need to do is replace df with the name of your data frame, which should have one column per group.
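ToothGrowth is in long format (one row per observation) rather than one column per group, so here is a rough sketch of how the same idea might be applied to it directly. The group labels such as VC_all and VC_0.5 are built to resemble the ones in the question, the list name groups is just an illustrative choice, and I am assuming you want every dose (0.5, 1 and 2) and the pooled-variance test (var.equal = TRUE) as in your example.

```r
# Sketch: pairwise t-test matrix for ToothGrowth subgroups, base R only.
data(ToothGrowth)

# One vector of tooth lengths per subgroup: each supp overall, then by dose.
# VC is listed first so that the row/column order resembles the question.
groups <- list()
for (s in c("VC", "OJ")) {
  in_supp <- ToothGrowth$supp == s
  groups[[paste0(s, "_all")]] <- ToothGrowth$len[in_supp]
  for (d in sort(unique(ToothGrowth$dose))) {
    groups[[paste0(s, "_", d)]] <- ToothGrowth$len[in_supp & ToothGrowth$dose == d]
  }
}

# Empty character matrix with the group names on both dimensions.
results <- matrix(NA_character_, length(groups), length(groups),
                  dimnames = list(names(groups), names(groups)))

# Fill cell [row, col] with mean(row) - mean(col) and the t-statistic.
# Indexing by name avoids relying on the ordering of upper.tri()/lower.tri().
pairs <- t(combn(names(groups), 2))
for (k in seq_len(nrow(pairs))) {
  r  <- pairs[k, 1]
  cl <- pairs[k, 2]
  tt <- t.test(groups[[r]], groups[[cl]], var.equal = TRUE)
  results[r, cl] <- paste0(round(tt$estimate[1] - tt$estimate[2]),
                           " (", round(tt$statistic, 2), ")")
}

print(results, quote = FALSE, na.print = "")
```

Only the upper triangle is filled here, because the difference changes sign when the two groups are swapped. If you also want the lower triangle, you could add a second assignment to results[cl, r] inside the loop, with the signs flipped.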
Upvotes: 1