Get p-value with two variables and multiple row names

Question

I wondered if you could help me compute p-values from this simple data frame. My data frame is called my_data. Viewing it shows the values I am comparing:

my_data <- read.csv("densityleftOK.csv", stringsAsFactors = FALSE)[c(1,2,3),]

      P1    P2   P3  P4  P5   T1  T2  T3  T4  T5  T6
A     1008 1425 869 1205 954  797 722 471 435 628 925
B      550  443 317  477 337  383  54 111  27 239 379
C      483  574 597  375 593  553 249 325 238 354 411

I would like to get a single p-value for each row by comparing the placebo (P) samples against the treated (T) samples. If possible, I'd also like the standard deviation for the placebo (P) and treated (T) groups.

I appreciate any help. Thanks

StupidWolf · Accepted Answer

You can try something like the code below: pivot the data into long format, introduce a grouping vector ("P" or "T"), group by the ids, and use tidy on t.test to collect the results in a table:

library(broom)
library(tidyr)
library(dplyr)
library(tibble)

data = read.table(text="P1    P2   P3  P4  P5   T1  T2  T3  T4  T5  T6
A     1008 1425 869 1205 954  797 722 471 435 628 925
B      550  443 317  477 337  383  54 111  27 239 379
C      483  574 597  375 593  553 249 325 238 354 411",header=TRUE,row.names=1)

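res = data %>% 
  rownames_to_column("id") %>%               # keep row names as an id column
  pivot_longer(-id) %>%                      # one row per (id, sample) pair
  mutate(grp=sub("[0-9]+$","",name)) %>%     # "P1" -> "P", "T6" -> "T"
  group_by(id) %>% 
  do(tidy(t.test(value ~ grp,data=.))) %>%   # Welch t-test per id
  select(c(id,estimate,estimate1,estimate2,statistic,p.value)) %>%
  mutate(stderr = estimate/statistic)        # standard error of the difference in means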

# A tibble: 3 x 7
# Groups:   id [3]
  id    estimate estimate1 estimate2 statistic p.value stderr
  <chr>    <dbl>     <dbl>     <dbl>     <dbl>   <dbl>  <dbl>
1 A         429.     1092.      663       3.40 0.00950  126. 
2 B         226.      425.      199.      2.89 0.0192    78.2
3 C         169.      524.      355       2.65 0.0266    64.0
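The stderr column above is derived as estimate/statistic. As a minimal sketch, assuming the default Welch test (var.equal = FALSE), the same number can be rebuilt by hand for row A from the two group variances. The variable names placebo, treated, se_manual and se_ratio are illustrative only and do not come from the answer:

```r
# Row A from the data above, split by group
placebo <- c(1008, 1425, 869, 1205, 954)
treated <- c(797, 722, 471, 435, 628, 925)

# t.test() runs Welch's unequal-variance test unless var.equal = TRUE
tt <- t.test(placebo, treated)

# Welch standard error of the difference in means
se_manual <- sqrt(var(placebo) / length(placebo) + var(treated) / length(treated))

# Same quantity recovered as (difference in means) / t statistic
se_ratio <- unname((mean(placebo) - mean(treated)) / tt$statistic)

all.equal(se_manual, se_ratio)
```

Here se_manual should agree with the stderr shown for row A (about 126.4). Note that this is the standard error of the difference in means, not a standard deviation of either group.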

If you'd rather not use packages, it's a matter of using apply, and it is easier to declare the groups up front:

grp = gsub("[0-9]","",colnames(data))

res = apply(data,1,function(i){
  data.frame(t.test(i~grp)[c("statistic","p.value","stderr")])
})

res = do.call(rbind,res)
  statistic     p.value    stderr
A  3.395303 0.009498631 126.40994
B  2.890838 0.019173060  78.16650
C  2.646953 0.026608838  63.99812
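The question also asks for a standard deviation per group, which the stderr column does not give directly. A rough sketch of one way to get it in base R, assuming the sample standard deviation within each group is what is wanted (the sds name and the sd_ column prefix are made up for illustration):

```r
# Same data as in the answer
data <- read.table(text = "P1    P2   P3  P4  P5   T1  T2  T3  T4  T5  T6
A     1008 1425 869 1205 954  797 722 471 435 628 925
B      550  443 317  477 337  383  54 111  27 239 379
C      483  574 597  375 593  553 249 325 238 354 411",
  header = TRUE, row.names = 1)

# Group label per column: strip the digits so "P1" -> "P", "T6" -> "T"
grp <- gsub("[0-9]", "", colnames(data))

# For each row, compute the sample SD separately within P and within T
sds <- t(apply(data, 1, function(i) tapply(i, grp, sd)))
colnames(sds) <- paste0("sd_", colnames(sds))

sds
```

The result is a matrix with one row per id and the columns sd_P and sd_T, so it can be attached to the t-test table with cbind if the row order matches.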
