Dr. Andi Lowe

Reputation: 510

R + ggplot2, multiple histograms in the same plot with each histogram normalised to unit area?

Sorry for the newbie R question...

I have a data.frame that contains measurements of a single variable. These measurements will be distributed differently depending on whether the thing being measured is of type A or type B; that is, you can imagine that my column names are: measurement, type label (A or B).

I want to plot the histograms of the measurements for A and B separately, and put the two histograms in the same plot, with each histogram normalised to unit area (this is because I expect the proportions of A and B to differ significantly). By unit area, I mean that A and B each have unit area, not that A+B have unit area.

Basically, I want something like geom_density, but I don't want a smoothed distribution for each; I want the histogram bars. Not interleaved, but plotted one on top of the other. Not stacked, although it would be interesting to know how to do this also. (The purpose of this plot is to explore differences in the shapes of the distributions that would indicate quantitative differences between A and B that could be used to distinguish between them.)

That's all. Two or more histograms -- not smoothed density plots -- in the same plot with each normalised to unit area. Thanks!

Upvotes: 2

Views: 4871

Answers (1)

jlhoward

Reputation: 59365

Something like this?

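# generate example data: 1000 samples of type A, 4000 of type B
set.seed(1)
df <- data.frame(Type=c(rep("A",1000),rep("B",4000)),
                 Value=c(rnorm(1000,mean=25,sd=10),rchisq(4000,15)))
# you start here...
library(ggplot2)
# y=after_stat(density) scales each group's bars to unit area;
# it replaces ..density.., which is deprecated as of ggplot2 3.4.0
ggplot(df, aes(x=Value))+
  geom_histogram(aes(y=after_stat(density),fill=Type),color="grey80")+
  facet_grid(Type~.)   # one panel per Type, stacked vertically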

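Note that there are 4 times as many samples of type B, yet each histogram is normalised to unit area on its own, so the two shapes can be compared directly.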
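The faceted plot puts A and B in separate panels. If the bars should share a single panel, drawn one over the other as the question describes, a minimal sketch could look like the following. It assumes ggplot2 3.3.0 or later (for after_stat) and reuses the same example data; the names p_overlay and areas, the alpha value, and bins = 30 are illustrative choices, not part of the original answer.

```r
library(ggplot2)

# Same example data as above: 1000 samples of type A, 4000 of type B
set.seed(1)
df <- data.frame(Type  = c(rep("A", 1000), rep("B", 4000)),
                 Value = c(rnorm(1000, mean = 25, sd = 10), rchisq(4000, 15)))

# position = "identity" draws both histograms from the same baseline instead
# of stacking them; alpha makes the overlap visible.
# Mapping fill = Type defines the groups, and stat_bin computes density
# separately for each group.
p_overlay <- ggplot(df, aes(x = Value, fill = Type)) +
  geom_histogram(aes(y = after_stat(density)),
                 position = "identity", alpha = 0.5,
                 bins = 30, color = "grey80")
print(p_overlay)

# Sanity check: sum of bar height times bar width, per group
built <- ggplot_build(p_overlay)$data[[1]]
areas <- tapply(built$density * (built$xmax - built$xmin), built$group, sum)
print(areas)
```

Replacing position = "identity" with position = "stack" stacks the bars instead, but then the combined outline has a total area of 2, since each type still contributes unit area.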

For the faceted plot, you can also let each panel's y-axis float, so that it is scaled to its own data, by passing scales="free_y" in the call to facet_grid(...).
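As a sketch of that option, assuming the same example data and ggplot2 3.3.0 or later (the name p_free and bins = 30 are illustrative):

```r
library(ggplot2)

# Same example data as above
set.seed(1)
df <- data.frame(Type  = c(rep("A", 1000), rep("B", 4000)),
                 Value = c(rnorm(1000, mean = 25, sd = 10), rchisq(4000, 15)))

# scales = "free_y" gives each facet row its own y-axis range,
# while the x-axis stays shared so the shapes remain comparable
p_free <- ggplot(df, aes(x = Value)) +
  geom_histogram(aes(y = after_stat(density), fill = Type),
                 bins = 30, color = "grey80") +
  facet_grid(Type ~ ., scales = "free_y")
print(p_free)
```

With a shared y-axis, the heights of the two panels can be compared directly; with free scales, each panel uses its full height, which can make a flatter distribution easier to read.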

Upvotes: 7
