JeffZheng

Reputation: 1405

Convert the t.test results (p-values) for many columns of a data frame into a data frame labelled with each column's name

After running t.test on several columns (named q1, q2, q3) of a data frame, the results looked like this:

$q1  

Welch Two Sample t-test  

data:  i by d$group  
t = -0.76262, df = 17.323, p-value = 0.4559  
alternative hypothesis: true difference in means is not equal to 0  
95 percent confidence interval:  
 -1.2294678  0.5759458  
sample estimates:  
mean in group A mean in group B   
    -0.05443279      0.27232820   


$q2  

Welch Two Sample t-test  

data:  i by d$group  
t = -1.6467, df = 17.731, p-value = 0.1172  
alternative hypothesis: true difference in means is not equal to 0  
95 percent confidence interval:  
 -1.2881952  0.1568201  
sample estimates:  
mean in group A mean in group B   
     -0.3906697       0.1750179   


$q3  

Welch Two Sample t-test  

data:  i by d$group  
t = 0.52889, df = 13.016, p-value = 0.6058  
alternative hypothesis: true difference in means is not equal to 0  
95 percent confidence interval:  
 -0.7569843  1.2478547  
sample estimates:  
mean in group A mean in group B   
    0.253746354     0.008311147   

I want to extract the individual p-values and put them in a data frame or matrix like the following:

   (the column name) q1 q2 q3  
   (the p-value) 0.4559 0.1172 0.6058

I tried saving the t.test results (a list) as d_df_ttest and then using a
for loop like this:

for(v in 1:length(d_df_ttest)) {   
print (d_df_ttest[[v]]$p.value)  
}

But I can only get the bare values, without the column names:

[1] 0.4559469
[1] 0.1172263
[1] 0.6057874

Would you please help me to get a data frame with the original column name (q1,q2,q3) and the corresponding p-value?

Thanks a lot,

Jeff

Upvotes: 1

Views: 150

Answers (2)

akrun

Reputation: 886938

We could do this with summarise_at

library(dplyr)
d %>% 
  summarise_at(vars(matches("q\\d+")), funs(t.test(.~ group)$p.value))
#        q1        q2        q3
#1 0.4559469 0.1172263 0.6057874

Or with base R

sapply(d[1:3], function(x) t.test(x ~ d$group)$p.value)
#       q1        q2        q3 
# 0.4559469 0.1172263 0.6057874 

data

set.seed(123)  
d <- data.frame(
 q1 = rnorm(20),
 q2 = rnorm(20),
 q3 = rnorm(20),
 group = sample(c("A", "B"), size = 20, replace = TRUE)) 
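
If the t.test results are already stored in a list, as in the question, the p-values can also be pulled out of that list directly. The following is a minimal sketch, assuming d_df_ttest was built with lapply over the q columns (the question does not show how it was created); p_values, p_wide and p_long are illustrative names, not from the original post.

```r
# Example data, same shape as in the answer above
set.seed(123)
d <- data.frame(
  q1 = rnorm(20),
  q2 = rnorm(20),
  q3 = rnorm(20),
  group = sample(c("A", "B"), size = 20, replace = TRUE)
)

# Assumed construction of the asker's list: one htest object per column,
# named after the column it came from
d_df_ttest <- lapply(d[c("q1", "q2", "q3")], function(i) t.test(i ~ d$group))

# Extract the p-value from each list element; vapply keeps the list names
p_values <- vapply(d_df_ttest, function(res) res$p.value, numeric(1))

# Wide layout: one row, one column per original variable
p_wide <- as.data.frame(t(p_values))
rownames(p_wide) <- "p.value"

# Long layout: one row per variable, with the column name as a value
p_long <- data.frame(column = names(p_values), p.value = unname(p_values))
```

The wide layout matches the shape asked for in the question; the long layout is usually easier to filter or merge with other results.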

Upvotes: 1

Rafael Díaz

Reputation: 2289

Create an empty matrix, and then fill it with a for loop

# Create data
set.seed(123)  # This makes sampling replicable
df <- data.frame(
  q1 = rnorm(20),
  q2 = rnorm(20),
  q3 = rnorm(20),
  group = sample(c("A", "B"), size = 20, replace = TRUE)
)

pval <- matrix(NA, ncol = ncol(df) - 1, nrow = 1,
               dimnames = list("p-value", colnames(df)[-4]))
for (i in 1:(ncol(df) - 1)) {
  pval[, i] <- t.test(df[, i] ~ df$group)$p.value
}
pval
              q1        q2        q3
p-value 0.4559469 0.1172263 0.6057874
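
The loop above relies on group being the fourth and last column. A variant that selects the columns by name, and converts the matrix to a data frame at the end, might look like the sketch below; value_cols and pval_df are illustrative names, and it assumes every column other than group should be tested.

```r
# Same example data as above
set.seed(123)
df <- data.frame(
  q1 = rnorm(20),
  q2 = rnorm(20),
  q3 = rnorm(20),
  group = sample(c("A", "B"), size = 20, replace = TRUE)
)

# Every column except the grouping variable (assumption: all of them are numeric)
value_cols <- setdiff(names(df), "group")

# One-row matrix with the tested column names as column labels
pval <- matrix(NA_real_, nrow = 1, ncol = length(value_cols),
               dimnames = list("p-value", value_cols))

# Fill each cell by column name instead of by position
for (col in value_cols) {
  pval[1, col] <- t.test(df[[col]] ~ df$group)$p.value
}

# Convert to a data frame if a data frame rather than a matrix is needed
pval_df <- as.data.frame(pval)
```

Indexing by name means the code keeps working if group is moved or more q columns are added.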

Upvotes: 1
