user305902

Reputation: 135

How to do T-tests between all categorical rows in a dataframe in R?

I can manually do t-tests when the comparison is between columns, but how would I do t-tests across rows? The following example data frame demonstrates what I mean by t-tests across rows.

Fruit    Sweetness Score
Apple    8
Apple    7
Apple    8
Banana   9
Banana   10
Banana   10
Banana   10
Kiwi     4
Kiwi     5
Kiwi     6

So how would I do a t-test to see whether the mean sweetness of apples differs from that of bananas and of kiwis? My actual data frame is 100+ rows long and has many more than 3 categories, but I want to figure it out for 3 items first, row-wise. Also, is it possible to run t-tests between all categories automatically (Apples vs Bananas, Apples vs Kiwis, and Bananas vs Kiwis) without manually specifying the row names?

Upvotes: 1

Views: 72

Answers (1)

user2974951

Reputation: 10375

I would do an ANOVA combined with a Tukey HSD test, which controls the family-wise error rate and is therefore more robust than performing many separate t-tests (you should of course check that the ANOVA assumptions hold in your case).

mod <- aov(SweetnessScore ~ Fruit, data = df)
summary(mod)

            Df Sum Sq Mean Sq F value   Pr(>F)    
Fruit        2  38.68  19.342   39.63 0.000152 ***
Residuals    7   3.42   0.488                     
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
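
The assumption check mentioned above could look something like the following. This is only a sketch: it rebuilds the example data from the question (assuming the score column is named SweetnessScore, as in the aov() call), and the choice of Shapiro-Wilk for residual normality and Bartlett for equal variances is my assumption, not the only reasonable option. With groups this small, both tests have little power, so treat the results as a rough guide.

```r
# Example data from the question; the column name SweetnessScore is an
# assumption chosen to match the aov() call in this answer.
df <- data.frame(
  Fruit = factor(rep(c("Apple", "Banana", "Kiwi"), times = c(3, 4, 3))),
  SweetnessScore = c(8, 7, 8, 9, 10, 10, 10, 4, 5, 6)
)

mod <- aov(SweetnessScore ~ Fruit, data = df)

# Normality of the model residuals (Shapiro-Wilk test)
norm_check <- shapiro.test(residuals(mod))

# Homogeneity of variance across the fruit groups (Bartlett test)
var_check <- bartlett.test(SweetnessScore ~ Fruit, data = df)

print(norm_check)
print(var_check)
```

Small p-values in either test would suggest the corresponding assumption is questionable; plot(mod) gives the usual diagnostic plots as a visual alternative.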

Always check first whether the factor as a whole is significant; if it is, follow up with the pairwise comparisons:

TukeyHSD(mod)

  Tukey multiple comparisons of means
    95% family-wise confidence level

Fit: aov(formula = SweetnessScore ~ Fruit, data = df)

$Fruit
                  diff        lwr        upr     p adj
Banana-Apple  2.083333  0.5118688  3.6547979 0.0141688
Kiwi-Apple   -2.666667 -4.3466329 -0.9867004 0.0055946
Kiwi-Banana  -4.750000 -6.3214646 -3.1785354 0.0001165
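
If you do want the pairwise t-tests themselves, without typing out each pair, base R can generate them. The sketch below again rebuilds the example data (column name SweetnessScore assumed) and shows two routes: pairwise.t.test() from the stats package, and a manual loop over combn() that calls t.test() for each pair. The Holm correction and the use of Welch (unequal-variance) tests via pool.sd = FALSE are my choices for illustration; adjust them to your situation.

```r
# Example data from the question; SweetnessScore is an assumed column name.
df <- data.frame(
  Fruit = factor(rep(c("Apple", "Banana", "Kiwi"), times = c(3, 4, 3))),
  SweetnessScore = c(8, 7, 8, 9, 10, 10, 10, 4, 5, 6)
)

# Route 1: every pair of levels in one call, with Holm-adjusted p-values.
# pool.sd = FALSE runs a separate Welch t-test for each pair.
pw <- pairwise.t.test(df$SweetnessScore, df$Fruit,
                      p.adjust.method = "holm", pool.sd = FALSE)
print(pw)

# Route 2: the same comparisons by hand, looping over all pairs of levels.
pairs <- combn(levels(df$Fruit), 2, simplify = FALSE)
raw_p <- sapply(pairs, function(p) {
  sub <- droplevels(df[df$Fruit %in% p, ])  # keep only the two groups
  t.test(SweetnessScore ~ Fruit, data = sub)$p.value
})
names(raw_p) <- sapply(pairs, paste, collapse = " vs ")

# Correct for multiple comparisons (Holm)
adj_p <- p.adjust(raw_p, method = "holm")
print(adj_p)
```

Both routes work for any number of categories, since the pairs are derived from levels(df$Fruit). The manual loop is handy when you need more than the p-value from each test (for example the confidence interval), because you can return the whole t.test() result instead.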

Upvotes: 1
