AML

Reputation: 43

How do I plot error bars from a second dataframe of summary stats onto my plotted raw data, ggplot2?

I've got my plot how I like it as far as the raw data is concerned. Basically, I want something like a bar chart, but instead of a filled-in bar you see the raw data points and their confidence intervals. Right now all the plotting is done from my long-format dataframe (dat) of raw values. I have summary stats, including the low and high confidence limits for each point cloud, in a separate dataframe (sum_stats).

The original plot call:

dat %>% subset(y_var > 0) %>%
  ggplot(aes(x = x_var, y = y_var, group = Treatment, col = Treatment, shape = Treatment)) +
  geom_point(position = position_jitterdodge(jitter.width = 1.5, dodge.width = 1)) +
  theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
  facet_wrap(~comp_group)

When I tried adding geom_error(data = sum_stats, aes(x=x_var, ymin=ci_low, ymax=ci_high)), I got an error:

Error in FUN(X[[i]], ...) : object 'y_var' not found

When I take the geom_error() call back out, the plot comes up no problem.

The image below is 2 of the 4 facets of the plot, but without the confidence intervals over the clouds of data points. How do I get the confidence intervals onto each cloud of points?

Upvotes: 1

Views: 518

Answers (1)

dc37

Reputation: 16178

Without a reproducible example of your data, I can't be sure that my solution will work for your particular dataset. Also, I think you made a typo: to my knowledge, geom_error does not exist (geom_errorbar does).

However, if you want to plot the confidence intervals onto each cloud of points, you can use geom_pointrange. Here is an example using the iris dataset.

1) Generating the statistical summary from the iris dataset

library(dplyr)
iris_stat <- iris %>% group_by(Species) %>% 
  summarise(Mean = mean(Sepal.Length), SD = sd(Sepal.Length)) %>%
  mutate(Upper = Mean+SD, Lower = Mean-SD)

# A tibble: 3 x 5
  Species     Mean    SD Upper Lower
  <fct>      <dbl> <dbl> <dbl> <dbl>
1 setosa      5.01 0.352  5.36  4.65
2 versicolor  5.94 0.516  6.45  5.42
3 virginica   6.59 0.636  7.22  5.95
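The summary above uses mean ± 1 SD as the interval. If an actual 95% confidence interval for the mean is wanted instead, a t-based version could be sketched roughly as follows. The names `iris_ci`, `N`, `SE`, and `Margin` are introduced here for illustration only, and the sketch assumes approximately normal data within each species.

```r
library(dplyr)

# Per-species mean with a t-based 95% confidence interval.
# iris_ci, N, SE and Margin are illustrative names, not part of the original answer.
iris_ci <- iris %>%
  group_by(Species) %>%
  summarise(
    N    = n(),                          # group size
    Mean = mean(Sepal.Length),
    SE   = sd(Sepal.Length) / sqrt(N),   # standard error of the mean
    .groups = "drop"
  ) %>%
  mutate(
    Margin = qt(0.975, df = N - 1) * SE, # half-width of the 95% interval
    Lower  = Mean - Margin,
    Upper  = Mean + Margin
  )

iris_ci
```

These intervals are much narrower than mean ± SD, because they describe the uncertainty of the mean rather than the spread of the raw points.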

2) Then, we can plot the cloud of points using geom_jitter and the intervals using geom_pointrange (here Lower and Upper are mean ± 1 SD, standing in for your confidence limits):

library(ggplot2)
ggplot() +
  geom_jitter(data = iris,
              aes(x = Species, y = Sepal.Length, color = Species, shape = Species),
              width = 0.2, size = 2) +
  geom_pointrange(data = iris_stat,
                  aes(x = Species, y = Mean, ymin = Lower, ymax = Upper))

This gives a plot with the jittered points for each species and a point range (mean with lower and upper limits) drawn over each cloud.

Is this what you are looking for?

If not, please consider providing a reproducible example of your data (see: How to make a great R reproducible example)
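As for the `object 'y_var' not found` error itself: a likely cause is that a layer inherits every aesthetic set in `ggplot(aes(...))` by default, so the error bar layer also tries to evaluate `y = y_var` against sum_stats, which has no such column. Below is a minimal sketch of a grouped, dodged version that avoids this with `inherit.aes = FALSE`. It uses the built-in ToothGrowth dataset as a stand-in for dat (with `dose` playing the role of x_var and `supp` the role of Treatment), and the 95% t-based limits are an assumption about how ci_low and ci_high were computed.

```r
library(dplyr)
library(ggplot2)

# Stand-in for the raw long-format data (assumption: ToothGrowth replaces dat)
dat <- ToothGrowth %>% mutate(dose = factor(dose))

# Stand-in for sum_stats: one row per point cloud with t-based 95% limits
sum_stats <- dat %>%
  group_by(dose, supp) %>%
  summarise(
    mean_len = mean(len),
    ci_low   = mean_len - qt(0.975, n() - 1) * sd(len) / sqrt(n()),
    ci_high  = mean_len + qt(0.975, n() - 1) * sd(len) / sqrt(n()),
    .groups  = "drop"
  )

p <- ggplot(dat, aes(x = dose, y = len, group = supp, colour = supp, shape = supp)) +
  geom_point(position = position_jitterdodge(jitter.width = 0.2, dodge.width = 0.8)) +
  geom_errorbar(
    data = sum_stats,
    aes(x = dose, ymin = ci_low, ymax = ci_high, group = supp),
    inherit.aes = FALSE,                    # do not look for y = len in sum_stats
    position = position_dodge(width = 0.8), # same dodge width as the points
    width = 0.2, colour = "black"
  )

p
```

Using the same dodge width in both layers is what keeps each error bar centred on its own cloud of points. A `facet_wrap()` call can be added as in the original plot, provided the faceting column is present in both data frames.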

Upvotes: 1
