RPlotter

Reputation: 109

How to plot multiple group means and the confidence intervals in ggplot2 (R)?

I have data that looks like this:

A  B  C
8  5  2
9  3  1
1  2  3
3  1  2
4  3  1

I need to plot the mean of each column along with its confidence interval using ggplot2. I also want to derive the confidence intervals from the data itself (e.g. using stat_summary(fun.data = mean_cl_normal)); however, I am not sure how to plot the means when the data is in this format.

I tried the following code, but it does not run. I am not sure what needs to go into the y in line 2.

pd <- position_dodge(0.78)
ggplot(dat, y = c(dat$A,dat$B,dat$C) + ylim(0,10) + theme_bw()) + 
  stat_summary(geom="bar", fun.y=mean, position = "dodge") + 
  stat_summary(geom="errorbar", fun.data=mean_cl_normal, position = pd)

I get the following error:

Warning messages:
1: Computation failed in `stat_summary()`:
object 'x' not found 
2: Computation failed in `stat_summary()`:
object 'x' not found

Upvotes: 0

Views: 17651

Answers (2)

Nate

Reputation: 10671

Like David said, you need long format first, but after that fun.data = "mean_cl_normal" (or any of the other summary functions) works just fine:

library(tidyr)
library(ggplot2)   # mean_cl_normal also needs the Hmisc package installed
dat <- gather(dat) # gather to long form: columns `key` and `value`

ggplot(data = dat, aes(x = key, y = value)) +
    geom_point(size = 4, alpha = .5) + # always plot the raw data
    stat_summary(fun.data = "mean_cl_normal", geom = "crossbar") +
    labs(title = "95% Mean Confidence Intervals")
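If tidyr is not available, the same wide-to-long step can be sketched in base R with stack(). This is only a minimal sketch: it assumes dat holds the three columns from the question, and the names key and value are chosen just to match what gather() produces.

```r
# Wide data from the question (assumed to be what `dat` contains)
dat <- data.frame(A = c(8, 9, 1, 3, 4),
                  B = c(5, 3, 2, 1, 3),
                  C = c(2, 1, 3, 2, 1))

# stack() returns the columns `values` and `ind`; rename and reorder
# them so the result has the same layout as the gather() output
long <- stack(dat)
names(long) <- c("value", "key")
long <- long[, c("key", "value")]

head(long)
```

The resulting long data frame can be passed to the ggplot call in place of the gathered dat.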


If you would rather compute intervals by hand, lm and confint give you an interval for each group mean (they pool the residual variance across groups, so they will not match mean_cl_normal exactly):

mod <- lm(value ~ 0 + key, data = dat)
ci <- confint(mod)
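To see how the model-based intervals compare with per-group t intervals (which is what mean_cl_normal reports, via Hmisc), here is a rough base R sketch. The data frame is assumed to be the long-format version of the question's data, and the names ci_pooled and ci_separate are illustrative only.

```r
# Long-format version of the question's data (assumed)
dat <- data.frame(
  key   = rep(c("A", "B", "C"), each = 5),
  value = c(8, 9, 1, 3, 4,  5, 3, 2, 1, 3,  2, 1, 3, 2, 1)
)

# Pooled intervals: one residual variance shared by all groups
mod <- lm(value ~ 0 + key, data = dat)
ci_pooled <- confint(mod)

# Separate intervals: a one-sample t interval for each group
ci_separate <- t(sapply(split(dat$value, dat$key),
                        function(v) t.test(v)$conf.int))
colnames(ci_separate) <- colnames(ci_pooled)

ci_pooled
ci_separate
```

Both sets of intervals are centred on the group means. The pooled ones all have the same width here because the groups are the same size, while the separate ones are wider for the noisier groups.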

Upvotes: 3

David

Reputation: 610

Your data isn't in long format. In long format it would look like this:

# each = 5 keeps every group's five values together (A first, then B, then C)
thing <- data.frame(Group = factor(rep(c("A", "B", "C"), each = 5)),
                    Y = c(8, 9, 1, 3, 4,
                          5, 3, 2, 1, 3,
                          2, 1, 3, 2, 1))

You can use a function like melt() from the reshape2 package to get the data into this format.

Once you have that, you also have to calculate the means and SEs for your data (either by hand before calling ggplot, or with the right expressions inside stat_summary). Note that mean_cl_normal is a ggplot2 wrapper that relies on the Hmisc package, so it only works if Hmisc is installed.

Let's do it by hand then.

library(plyr)

# per-group sample size, mean, standard deviation and standard error
cdata <- ddply(thing, "Group", summarise,
               N    = length(Y),
               mean = mean(Y),
               sd   = sd(Y),
               se   = sd / sqrt(N)
)
cdata

#  Group N mean       sd        se
#1     A 5  5.0 3.391165 1.5165751
#2     B 5  2.8 1.483240 0.6633250
#3     C 5  1.8 0.836660 0.3741657

Now you can use ggplot.

pd <- position_dodge(0.78)

ggplot(cdata, aes(x = Group, y = mean, group = Group)) +
  # draw the group means
  geom_point(position = pd) +
  # draw error bars at mean +/- 2 SE (an approximate 95% CI)
  geom_errorbar(aes(ymin = mean - 2 * se, ymax = mean + 2 * se, color = Group),
                width = .1, position = pd)
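With only five observations per group, a multiplier of 2 is a rough stand-in for a 95% interval. A sketch of the t-based version follows, using only base R to build the summary table. The column names tcrit, lower and upper are illustrative, and the data frame is assumed to be the long-format version of the question's data.

```r
# Long-format data, five observations per group (from the question)
thing <- data.frame(
  Group = factor(rep(c("A", "B", "C"), each = 5)),
  Y     = c(8, 9, 1, 3, 4,  5, 3, 2, 1, 3,  2, 1, 3, 2, 1)
)

# Per-group summary: sample size, mean and standard error
cdata <- do.call(rbind, lapply(split(thing, thing$Group), function(g) {
  data.frame(Group = g$Group[1], N = nrow(g),
             mean = mean(g$Y), se = sd(g$Y) / sqrt(nrow(g)))
}))

# t multiplier for a 95% interval with N - 1 degrees of freedom
# (about 2.78 when N = 5, noticeably larger than 2)
cdata$tcrit <- qt(0.975, df = cdata$N - 1)
cdata$lower <- cdata$mean - cdata$tcrit * cdata$se
cdata$upper <- cdata$mean + cdata$tcrit * cdata$se
cdata

# In the plot, the error-bar limits would then become:
#   geom_errorbar(aes(ymin = lower, ymax = upper), width = .1)
```

The rest of the plotting code stays the same; only ymin and ymax change.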

This gives the attached plot.

Mean and CI Plot

Upvotes: 7
