ANOVA - comparing 3 groups in R

Question

I am attempting to analyze a data set for a research project but have run into a lot of issues, and have not been able to find a directly related answer online. I have worked with other statistical programs but am new to R, and I have had the hardest time figuring out how to shape my data set to best answer my questions.

In this research, participants were asked to answer questions about pictures they were presented. The pictures were of faces exhibiting 3 emotions (happy, angry, sad). I now want to compare the answers given to each question with regard to those pictures, meaning I want to see if there are differences between these three groups.

I have used a one-way ANOVA in the past for doing this. In Minitab I would code the images as 3 levels of a factor (1, 2, 3) in one column and put the scores for the given question in the column next to it, so the specific picture and the score for the particular question would be lined up horizontally.

  Image pleasing
1     1        3
2     1        2
3     1        1
4     1        1
5     1        1
6     1        2

This is how I have it set up in R as well, but when I try to run an ANOVA I cannot, because Image is still of class integer and not a factor. Therefore it gives me this:

> Paov <- aov(Image ~ pleasing)
> summary(Paov)
             Df Sum Sq Mean Sq F value Pr(>F)
pleasing      1    0.7  0.6546   0.978  0.323
Residuals   813  544.3  0.6696               
26 observations deleted due to missingness

and then a post-hoc Tukey's test is meaningless. Minitab was able to show me the mean score for pleasing for each image and then tell me whether they are significantly different. How can I make Image a factor in R? And then how can I properly compare these three groups in their scores of pleasing?

Molx · Accepted Answer

Given the description of your data, here's a way to perform the analysis of variance and the Tukey test. First, some not-so-random data (which will give "interesting" results):

set.seed(40)
dat <- data.frame(Image = factor(rep(1:3, each = 10)),
                  Pleasing = c(sample(1:2, 10, replace = TRUE),
                               sample(c(1, 3), 10, replace = TRUE),
                               sample(2:3, 10, replace = TRUE)))
head(dat)
#   Image Pleasing
# 1     1        2
# 2     1        2
# 3     1        2
# 4     1        1
# 5     1        1
# 6     1        1

The aov is quite simple. Just note you have to use the data argument if your variables are in a data frame (using attach isn't recommended):

dat.aov <- aov(Pleasing ~ Image, data=dat)
summary(dat.aov)
#             Df Sum Sq Mean Sq F value  Pr(>F)   
# Image        2    7.2   3.600   6.568 0.00474 **
# Residuals   27   14.8   0.548                   
# ---
# Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
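
If Image is already in your data frame as an integer, as in the question, it can be converted with factor(). Here is a minimal sketch, assuming a data frame called mydata (a hypothetical name, filled with made-up scores) that uses the question's column names. The emotion labels are also an assumption about how the images were coded. Note that the response goes on the left of the formula, so the model is pleasing ~ Image rather than Image ~ pleasing:

```r
# Hypothetical stand-in for the questioner's data: Image holds integer codes
mydata <- data.frame(Image    = rep(1:3, each = 4),
                     pleasing = c(3, 2, 1, 1,  1, 2, 3, 3,  2, 3, 3, 2))
class(mydata$Image)   # "integer"

# Convert the integer codes to a factor.
# The labels are an assumption about which code belongs to which emotion.
mydata$Image <- factor(mydata$Image, levels = 1:3,
                       labels = c("happy", "angry", "sad"))

# Response on the left of ~, grouping factor on the right
Paov <- aov(pleasing ~ Image, data = mydata)
summary(Paov)
```

With Image as a factor, the Image row of the summary has 2 degrees of freedom (3 groups minus 1) instead of the single degree of freedom you get from a numeric predictor.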

Now for the Tukey, there are different ways to do it in R. I like to use the package multcomp because it provides more information with the results:

library(multcomp)

tukey <- cld(glht(dat.aov, linfct = mcp(Image = "Tukey")), decreasing = TRUE)

tukey$mcletters$Letters
#  1    2    3 
# "b" "ab"  "a"

The syntax looks rather complicated because in multcomp you use a general linear hypothesis function (glht), in which you perform a multiple comparison (mcp) and then extract the compact letter display of the Tukey results (cld).
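
If you would rather not install a package, base R has TukeyHSD(), which works directly on an aov fit. The sketch below rebuilds the example data so it runs on its own, and also prints the per-image means with tapply(), since that is what Minitab was showing. No output is listed here because the values drawn by sample() can differ between R versions:

```r
# Same example data as above, rebuilt so this block is self-contained
set.seed(40)
dat <- data.frame(Image = factor(rep(1:3, each = 10)),
                  Pleasing = c(sample(1:2, 10, replace = TRUE),
                               sample(c(1, 3), 10, replace = TRUE),
                               sample(2:3, 10, replace = TRUE)))
dat.aov <- aov(Pleasing ~ Image, data = dat)

# Mean Pleasing score for each image
tapply(dat$Pleasing, dat$Image, mean)

# Tukey honest significant differences for all pairs of images:
# one row per pair, with the difference, confidence limits and adjusted p-value
tk <- TukeyHSD(dat.aov, "Image")
tk
```

Each row of tk$Image is one pairwise comparison (2-1, 3-1, 3-2), and the "p adj" column holds the adjusted p-value for that pair.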

You can even plot the Tukey results, although the boxplots don't look very nice for this kind of data:

[Image: boxplots of the Tukey results]

As a final note, it's important to mention that I use this kind of analysis for continuous data (experimental lab measures), and I'm not sure it's correct for your categorical data (1-3 expression choice).
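
If that is a concern, one option that is often used for ordinal scores is a rank-based comparison. The following is only a sketch of that alternative using base R's kruskal.test() and pairwise.wilcox.test() on the same example data. It is not part of the analysis above, and whether it suits your design is something to check for yourself:

```r
# Same example data as above, rebuilt so this block is self-contained
set.seed(40)
dat <- data.frame(Image = factor(rep(1:3, each = 10)),
                  Pleasing = c(sample(1:2, 10, replace = TRUE),
                               sample(c(1, 3), 10, replace = TRUE),
                               sample(2:3, 10, replace = TRUE)))

# Kruskal-Wallis rank sum test: a rank-based analogue of the one-way ANOVA
kw <- kruskal.test(Pleasing ~ Image, data = dat)
kw

# Pairwise Wilcoxon rank sum tests with Holm-adjusted p-values.
# exact = FALSE requests the normal approximation, since the scores have many ties.
pw <- pairwise.wilcox.test(dat$Pleasing, dat$Image,
                           p.adjust.method = "holm", exact = FALSE)
pw
```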
