Simon H

Reputation: 75

T-test within a data frame in R

I'd like to perform an independent t.test within a data frame:

   eyecolor suncream moles
1      blue        x    10
2      blue        x     9
3      blue        x     6
4      blue        y    15
5      blue        y     7
6      blue        y     3
7     brown        x     9
8     brown        x     6
9     brown        x     4
10    brown        y     1
11    brown        y     2
12    brown        y     1

That means 1. selecting according to eyecolor and 2. performing a t.test on the number of moles for suncream x vs. y. I can already compute the group means with dplyr, e.g.:

df %>% group_by(eyecolor, suncream) %>% summarize(moles.mean = mean(moles))

Just to be clear, I would like to get a p-value comparing suncream x and y for every eyecolor.

Upvotes: 0

Views: 1220

Answers (2)

Bernhard

Reputation: 4417

Don't make it too complicated with dplyr. It does not play well with the formula interface of t.test, which is very helpful in this particular situation. HEITZ has given a dplyr answer. Compare how the version without dplyr is not only more idiomatic but also shorter, with fewer nested parentheses:

by(df, df$eyecolor, function(subs) t.test(subs$moles ~ subs$suncream))

or, if you really only want to see the p-values:

by(df, df$eyecolor, function(subs) t.test(subs$moles ~ subs$suncream)$p.value)
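For anyone who wants to run this end to end, here is a minimal sketch. It assumes the data frame from the question is rebuilt by hand (the `data.frame()` call is my reconstruction, not the asker's code) and uses `split()` plus `sapply()` instead of `by()` so the p-values come back as a plain named vector. Note that `t.test` defaults to the Welch test (`var.equal = FALSE`).

```r
# Rebuild the example data from the question (reconstruction, for illustration)
df <- data.frame(
  eyecolor = rep(c("blue", "brown"), each = 6),
  suncream = rep(rep(c("x", "y"), each = 3), times = 2),
  moles    = c(10, 9, 6, 15, 7, 3, 9, 6, 4, 1, 2, 1)
)

# Split the data frame by eye colour, then run one t-test per subset
# using the formula interface and keep only the p-value
p_values <- sapply(
  split(df, df$eyecolor),
  function(subs) t.test(moles ~ suncream, data = subs)$p.value
)

print(p_values)
```

The result is a numeric vector named by eyecolor, which is easier to reuse in later code than the printed output of `by()`.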

Upvotes: 0

HEITZ

Reputation: 136

This should probably be treated in an ANOVA context. Also, the OP should take some time to digest fundamentals of null hypothesis testing and t-tests if the answer is not clear. That said, here is an answer:

results = df %>% group_by(eyecolor) %>% summarize(p = t.test(moles[which(suncream == 'x')],moles[which(suncream=='y')])$p.value)
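A self-contained sketch of the same idea, in case it helps to see it run. The `data.frame()` call is a reconstruction of the question's table, and the `mean_x` and `mean_y` columns are an addition of mine to show the group means next to each p-value. The `which()` calls are dropped because logical indexing is enough here, as there are no missing values.

```r
library(dplyr)

# Rebuild the example data from the question (reconstruction, for illustration)
df <- data.frame(
  eyecolor = rep(c("blue", "brown"), each = 6),
  suncream = rep(rep(c("x", "y"), each = 3), times = 2),
  moles    = c(10, 9, 6, 15, 7, 3, 9, 6, 4, 1, 2, 1)
)

# One row per eye colour: group means plus the Welch t-test p-value
# comparing suncream x against suncream y
results <- df %>%
  group_by(eyecolor) %>%
  summarize(
    mean_x = mean(moles[suncream == "x"]),
    mean_y = mean(moles[suncream == "y"]),
    p      = t.test(moles[suncream == "x"], moles[suncream == "y"])$p.value
  )

print(results)
```

With only three observations per cell, these tests have very little power, so the p-values should be read with caution.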

Upvotes: 1
