Dnaiel

Reputation: 7832

Plotting confidence intervals in ggplot

I'd like to do the following plot using ggplot:

[Figure: ggplot of confidence intervals by group]

Here is an example of the structure of my df (roughly; the drawing above is not to scale with this data):

example.df = data.frame(mean = c(0.3,0.8,0.4,0.65,0.28,0.91,0.35,0.61,0.32,0.94,0.1,0.9,0.13,0.85,0.7,1.3), 
                            std.dev = c(0.01,0.03,0.023,0.031,0.01,0.012,0.015,0.021,0.21,0.13,0.023,0.051,0.07,0.012,0.025,0.058),
                            class = c("1","2","1","2","1","2","1","2","1","2","1","2","1","2","1","2"),
                            group = c("group1","group2","group1","group2","group1","group2","group1","group2","group1","group2","group1","group2","group1","group2","group1","group2"))

This data frame consists of 16 replicates, each with a given mean and a given standard deviation.

For each replicate I'd like to plot the confidence intervals, where the big dot in my figure example is the mean estimate, and the length of the bar is twice the standard deviation.

I'd also like to plot two different replicates on the same line, colored by class: red for class 1 and blue for class 2.

Finally, I'd like to divide the whole plot into two panels (in the same row) corresponding to the two different groups.

I tried looking into this site, http://www.cookbook-r.com/Graphs/Plotting_means_and_error_bars_(ggplot2)/ but couldn't figure out how to automate this for any data frame of this structure, with X number of groups (in this case 2), and K replicates per group (in this case 8, 4 of class 1 and 4 of class 2).

Is there a good way to do this using ggplot or standard R packages?

Upvotes: 5

Views: 16410

Answers (2)

Denis Cousineau

Reputation: 497

If you have the raw data rather than a table of means and standard deviations per cell, you could use superb (summary plot with error bars), which aggregates the scores automatically.

library(superb)

Let's simulate data to illustrate my point:

raw.df <- GRD( SubjectsPerGroup = 20, 
    BSFactors = c("class(1,2)", "group(group1,group2)", "replicate(4)")
)
head(raw.df)

The dependent variable is stored in the column DV.

superb(DV ~ replicate + class + group, raw.df, 
       plotStyle="point" )

Here, replicate, named first, goes on the x-axis; class is shown in different colors; and the last factor named, group, is shown in separate panels. The result is:

[Figure: mean plot of the 4 x 2 x 2 dataset]

You can then add additional graphic directives, e.g.,

library(ggplot2)  # provides theme_bw() and ylab()

superb(DV ~ replicate + class + group, raw.df, 
       plotStyle="point" ) + 
    theme_bw() + 
    ylab("My variable")
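For comparison, here is a minimal base R sketch of the same kind of per-cell aggregation, turning raw scores into the mean / std.dev layout used in the question. The simulated data, the `raw` and `summary.df` names, and the choice of 20 subjects per cell are assumptions for illustration only; this is not what superb does internally.

```r
set.seed(1)

# Hypothetical raw data: 20 subjects in each replicate x class x group cell
raw <- expand.grid(subject   = 1:20,
                   replicate = 1:4,
                   class     = c("1", "2"),
                   group     = c("group1", "group2"))
raw$DV <- rnorm(nrow(raw))

# One row per cell, with the mean and standard deviation of DV
summary.df <- aggregate(DV ~ replicate + class + group, data = raw,
                        FUN = function(x) c(mean = mean(x), std.dev = sd(x)))

# aggregate() returns DV as a matrix column; flatten it into plain columns
summary.df <- do.call(data.frame, summary.df)
print(names(summary.df))
```

The resulting columns are named DV.mean and DV.std.dev, so they would need renaming before being passed to code that expects mean and std.dev.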

Upvotes: 0

Didzis Elferts

Reputation: 98579

I suppose the sample data frame you provided isn't built appropriately, because all values in group1 have class 1 and all values in group2 have class 2. So I made a new data frame and added a column named replicate that gives the replicate number (four replicates, each with two class values, in each group).

example.df = data.frame(mean = c(0.3,0.8,0.4,0.65,0.28,0.91,0.35,0.61,0.32,0.94,0.1,
                                0.9,0.13,0.85,0.7,1.3), 
                        std.dev = c(0.01,0.03,0.023,0.031,0.01,0.012,0.015,0.021,0.21,
                                  0.13,0.023,0.051,0.07,0.012,0.025,0.058),
                        class = c("1","2","1","2","1","2","1","2","1","2","1",
                                 "2","1","2","1","2"),
                        group = rep(c("group1","group2"),each=8),
                        replicate=rep(rep(1:4,each=2),times=2))
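As a quick check of the point about the layout, the two designs can be cross-tabulated. This is only a sketch: the `orig` and `fixed` data frames are illustrative names that rebuild just the class and group columns of the question's data and of the revised data.

```r
# Layout from the question: class and group alternate together
orig <- data.frame(class = rep(c("1", "2"), times = 8),
                   group = rep(c("group1", "group2"), times = 8))

# Revised layout: first eight rows are group1, last eight are group2
fixed <- data.frame(class = rep(c("1", "2"), times = 8),
                    group = rep(c("group1", "group2"), each = 8))

orig_tab  <- table(orig$group, orig$class)
fixed_tab <- table(fixed$group, fixed$class)

print(orig_tab)   # class is confounded with group: two cells are empty
print(fixed_tab)  # every group x class cell holds four rows
```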

Now you can use geom_pointrange() to get points with confidence intervals and facet_wrap() to make a separate panel for each group.

library(ggplot2)

ggplot(example.df, aes(x=factor(replicate), y=mean,
                       ymin=mean-2*std.dev, ymax=mean+2*std.dev,
                       color=factor(class))) +
  geom_pointrange() + facet_wrap(~group)
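A possible variant, closer to the details in the question, is sketched below. It assumes that "the length of the bar is twice the standard deviation" means the bar spans mean ± std.dev (the call above spans mean ± 2*std.dev instead), maps class 1 to red and class 2 to blue, dodges the two classes so they do not overlap, and forces both panels into one row. The dodge width of 0.4 is an arbitrary choice.

```r
library(ggplot2)

example.df <- data.frame(
  mean = c(0.3, 0.8, 0.4, 0.65, 0.28, 0.91, 0.35, 0.61,
           0.32, 0.94, 0.1, 0.9, 0.13, 0.85, 0.7, 1.3),
  std.dev = c(0.01, 0.03, 0.023, 0.031, 0.01, 0.012, 0.015, 0.021,
              0.21, 0.13, 0.023, 0.051, 0.07, 0.012, 0.025, 0.058),
  class = rep(c("1", "2"), times = 8),
  group = rep(c("group1", "group2"), each = 8),
  replicate = rep(rep(1:4, each = 2), times = 2)
)

p <- ggplot(example.df,
            aes(x = factor(replicate), y = mean,
                ymin = mean - std.dev, ymax = mean + std.dev,  # assumed reading: total length 2 * std.dev
                color = class)) +
  geom_pointrange(position = position_dodge(width = 0.4)) +    # side by side within a replicate
  scale_color_manual(values = c("1" = "red", "2" = "blue")) +  # class 1 red, class 2 blue
  facet_wrap(~ group, nrow = 1) +                              # one panel per group, same row
  labs(x = "replicate", color = "class")

print(p)
```

Because facet_wrap() creates one panel per level of group, the same call should work for any number of groups or replicates as long as the data frame keeps these column names.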


Upvotes: 8
