Reputation: 47
New to posting to Stack so apologies for any issues.
I'm getting more comfortable in R and am currently looking at using broom/purrr to run multiple statistical tests at once. An example of my current data looks like this:
Subject | PreScoreTestA | PostScoreTestA | PreScoreTestB | PostScoreTestB | PreScoreTestC | PostScoreTestC |
---|---|---|---|---|---|---|
1 | 30 | 40 | 6 | 8 | 12 | 10 |
2 | 15 | 12 | 9 | 13 | 7 | 7 |
3 | 20 | 22 | 11 | 12 | 9 | 10 |
But over many more subjects and tests. I want to run a dependent (paired) t-test to see whether scores changed over the course of a training program, but I don't want to run a separate test by hand for each score.
I've seen a couple of examples of people using group_by, nest, and map to run multiple t-tests, but their data was in a longer format.
Is there a way to achieve the same goal while in a wide format? Or will I need to use pivot_longer to reshape the data?
Thanks in advance!
ETA: I had an edit here, but it was giving incorrect results, so I have removed it. I'm still looking for some help with the arguments and the same-length issue.
ETA, version 2:
I did find a workaround using pairwise.t.test (code below). It gives the same p-values as running t.test on each assessment individually. I'm curious why it works for pairwise.t.test but not for t.test. Please let me know if anyone has any ideas!
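library(tidyverse)  # dplyr, tidyr, purrr
library(broom)      # tidy()

results <- testb %>%
  # Split e.g. "PreScoreTestA" into time = "Pre" and test = "ScoreTestA"
  pivot_longer(-Subject,
               names_to = c("time", "test"), values_to = "score",
               names_pattern = "(Pre|Post)(.*)") %>%
  group_by(test) %>%
  nest() %>%
  # One paired Pre vs Post comparison per test, with no p-value adjustment
  mutate(ttests = map(.x = data, ~tidy(pairwise.t.test(.x$score, .x$time, paired = TRUE, p.adjust.method = "none")))) %>%
  unnest(ttests)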
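As a quick sanity check on why the p-values match: with only two levels in the grouping factor (Pre and Post), pairwise.t.test makes a single paired comparison, which should equal a direct paired t.test on the same two vectors. A minimal base R sketch, using the TestA scores from the table above (the variable names are only for illustration):

```r
# TestA scores from the example table, stacked in long form:
# the first three values are Pre, the last three are Post (same subject order)
score <- c(30, 15, 20, 40, 12, 22)
time  <- factor(rep(c("Pre", "Post"), each = 3))

# pairwise.t.test returns a matrix of p-values; with two levels it is 1 x 1
p_pairwise <- pairwise.t.test(score, time, paired = TRUE,
                              p.adjust.method = "none")$p.value[1, 1]

# The same comparison as a direct paired t-test on two vectors
p_direct <- t.test(score[time == "Post"], score[time == "Pre"],
                   paired = TRUE)$p.value

all.equal(p_pairwise, p_direct)
```

Both calls pair observations by position, so this only holds when the Pre and Post values are in the same subject order within each test.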
Upvotes: 0
Views: 1015
Reputation: 79112
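Here is an attempt without pivoting into long format. This again was finished with the help of the incredible akrun! See "How to apply t.test() to multiple pairs of columns after mutate across". Note that t.test() is called here without paired = TRUE, so the p-values below come from an unpaired (Welch) test; pass paired = TRUE for a dependent t-test: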
library(dplyr)
library(stringr)

df %>%
  # For each Pre column, look up the matching Post column by name
  summarise(across(starts_with('PreScore'),
                   ~ t.test(., get(str_replace(cur_column(), "^PreScore", "PostScore")))$p.value,
                   .names = "{.col}_TTest"))
  PreScoreTestA_TTest PreScoreTestB_TTest PreScoreTestC_TTest
1            0.767827            0.330604           0.8604162
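Since the question asks for a dependent test, here is a minimal sketch of the same column-pairing idea with paired = TRUE, written in base R so it has no package dependencies. The data frame rebuilds the three example rows from the question; pre_cols, post_cols and paired_p are illustrative names, not part of the code above.

```r
# The three example rows from the question
df <- data.frame(
  Subject        = 1:3,
  PreScoreTestA  = c(30, 15, 20), PostScoreTestA = c(40, 12, 22),
  PreScoreTestB  = c(6, 9, 11),   PostScoreTestB = c(8, 13, 12),
  PreScoreTestC  = c(12, 7, 9),   PostScoreTestC = c(10, 7, 10)
)

# Find every Pre column and derive the name of its Post partner
pre_cols  <- grep("^PreScore", names(df), value = TRUE)
post_cols <- sub("^PreScore", "PostScore", pre_cols)

# One paired t-test per (Pre, Post) pair, keeping only the p-value
paired_p <- mapply(
  function(pre, post) t.test(df[[post]], df[[pre]], paired = TRUE)$p.value,
  pre_cols, post_cols
)
names(paired_p) <- sub("^PreScore", "", pre_cols)

round(paired_p, 4)
```

With only three subjects the paired and unpaired p-values differ noticeably, so it is worth checking which test a given snippet actually runs.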
Upvotes: 2
Reputation: 171
Yes, some pivoting is needed. Assuming you have no directional hypotheses and you want a pre-post assessment for each test, this might be what you are looking for:
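library(tidyverse)

# Example data from the question, with shorter test names
df <- as.data.frame(rbind(c(1, 30, 40, 6, 8, 12, 10),
                          c(2, 15, 12, 9, 13, 7, 7),
                          c(3, 20, 22, 11, 12, 9, 10)))
names(df) <- c("Subject",
               "PrePushup", "PostPushup",
               "PreRun", "PostRun",
               "PreJump", "PostJump")

df %>%
  # Split e.g. "PrePushup" into time = "Pre" and test = "Pushup"
  pivot_longer(-Subject,
               names_to = c("time", "test"), values_to = "score",
               names_pattern = "(Pre|Post)(.*)") %>%
  group_by(test) %>%
  nest() %>%
  # One paired t-test per test; "Post" sorts before "Pre", so the difference is Post - Pre
  mutate(t_tests = map(data, ~t.test(score ~ time, data = .x, paired = TRUE))) %>%
  pull(t_tests) %>%
  # Names are hard-coded in the order the tests appear in the data
  purrr::set_names(c("Pushup", "Run", "Jump"))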
$Pushup
Paired t-test
data: score by time
t = 0.79241, df = 2, p-value = 0.5112
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-13.28958 19.28958
sample estimates:
mean of the differences
3
$Run
Paired t-test
data: score by time
t = 2.6458, df = 2, p-value = 0.1181
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-1.461250 6.127916
sample estimates:
mean of the differences
2.333333
$Jump
Paired t-test
data: score by time
t = -0.37796, df = 2, p-value = 0.7418
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-4.127916 3.461250
sample estimates:
mean of the differences
-0.3333333
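If a tidy table is more convenient than a list of printed tests, one option is to put Pre and Post side by side and pass them to t.test() as two vectors, then tidy() each result with broom. This also avoids the formula interface; as far as I know, R 4.4.0 and later reject paired = TRUE in formula calls to t.test(), so the two-vector form should be more portable. A sketch under those assumptions, reusing the example data (the results name and the extra pivot_wider step are additions for illustration):

```r
library(dplyr)
library(tidyr)
library(purrr)
library(broom)

# Example data from the question, with shorter test names
df <- data.frame(
  Subject   = 1:3,
  PrePushup = c(30, 15, 20), PostPushup = c(40, 12, 22),
  PreRun    = c(6, 9, 11),   PostRun    = c(8, 13, 12),
  PreJump   = c(12, 7, 9),   PostJump   = c(10, 7, 10)
)

results <- df %>%
  # Long format: one row per subject, time point and test
  pivot_longer(-Subject,
               names_to = c("time", "test"), values_to = "score",
               names_pattern = "(Pre|Post)(.*)") %>%
  # Back to one row per subject and test, with Pre and Post as columns,
  # so the pairing is explicit rather than implied by row order
  pivot_wider(names_from = time, values_from = score) %>%
  nest(data = -test) %>%
  # Paired t-test on the two vectors (difference is Post - Pre), tidied to one row
  mutate(t_tests = map(data, ~ tidy(t.test(.x$Post, .x$Pre, paired = TRUE)))) %>%
  select(-data) %>%
  unnest(t_tests)

results %>% select(test, estimate, statistic, p.value)
```

Each row of results then holds the mean difference, t statistic, p-value and confidence limits for one test, which is easier to filter or export than the printed htest objects.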
Upvotes: 3