dan

Reputation: 6314

Overlaying ggplot2 histograms with different binwidths

I have data with a 2-level factor which I'd like to plot with ggplot2 as overlaid histograms.

My data:

set.seed(1)
df <- data.frame(y = c(rnorm(1000),rnorm(10)), group = c(rep("A",1000),rep("B",10)))

My plot:

library(ggplot2)
ggplot(df, aes(y, fill = group)) + geom_histogram(alpha = 0.5, position = "identity")

The problem is that groups A and B have very different numbers of points, so plotting them together with a single shared binwidth, as this code does, is not ideal.

It also prints this message, because no binwidth is set:

`stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Is there a way to plot overlaying histograms with different binwidths?

Upvotes: 1

Views: 812

Answers (2)

hrbrmstr

Reputation: 78852

You can also split the data by factor level and give each histogram layer its own binwidth:

library(dplyr)
library(ggplot2)

set.seed(1)
df <- data.frame(y = c(rnorm(1000), rnorm(10)), 
                 group = c(rep("A", 1000), rep("B", 10)))

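gg <- ggplot()
# Group A: no binwidth given, so stat_bin() falls back to its default of 30 bins
gg <- gg + geom_histogram(data=filter(df, group=="A"), 
                          aes(y, fill=group), 
                          alpha=0.5)
# Group B: only 10 points, so use a much wider bin
gg <- gg + geom_histogram(data=filter(df, group=="B"), 
                          aes(y, fill=group), 
                          binwidth=4, alpha=0.5)
gg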
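
If you would rather not hard-code the binwidths, one option is to compute one per group from the data. This is only a sketch: it assumes the Freedman-Diaconis rule (2 * IQR / n^(1/3)) is an acceptable choice for your data, and `fd_binwidth` is a helper name made up for this example, not a ggplot2 function.

```r
library(ggplot2)

set.seed(1)
df <- data.frame(y = c(rnorm(1000), rnorm(10)),
                 group = c(rep("A", 1000), rep("B", 10)))

# Hypothetical helper: Freedman-Diaconis bin width, 2 * IQR / n^(1/3)
fd_binwidth <- function(x) 2 * IQR(x) / length(x)^(1/3)

# One binwidth per group, computed from that group's own values
bw <- tapply(df$y, df$group, fd_binwidth)

# Build one histogram layer per group, each with its own binwidth
layers <- lapply(names(bw), function(g) {
  geom_histogram(data = df[df$group == g, ],
                 aes(y, fill = group),
                 binwidth = bw[[g]], alpha = 0.5)
})

# Adding a list to a ggplot object adds each layer in turn
gg <- ggplot() + layers
gg
```

The same pattern should extend to more than two groups, since the layers are generated from whatever levels appear in `group`.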


Upvotes: 2

csgillespie

Reputation: 60522

You need to work with density, i.e. scale each group's histogram so that its total area is 1. In base graphics you would set freq=FALSE in hist(). For ggplot2 you can do:

ggplot(df, aes(y, fill = group)) + geom_histogram(aes(y = after_stat(density)), alpha = 0.5, position = "identity")

or

ggplot(df, aes(y, fill = group)) + geom_density(alpha = 0.5)
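
Density scaling can also be combined with a separate binwidth per group, which keeps the bar heights comparable even though the bins differ. The following is a rough sketch rather than a recommendation: the binwidths 0.25 and 1 are arbitrary values picked for illustration, and `after_stat()` assumes ggplot2 3.3.0 or later.

```r
library(ggplot2)

set.seed(1)
df <- data.frame(y = c(rnorm(1000), rnorm(10)),
                 group = c(rep("A", 1000), rep("B", 10)))

# Shared mapping: column y on the x axis, per-layer density on the y axis
gg <- ggplot(mapping = aes(x = y, y = after_stat(density), fill = group)) +
  # Narrow bins for the large group (0.25 is an assumed value)
  geom_histogram(data = subset(df, group == "A"),
                 binwidth = 0.25, alpha = 0.5) +
  # Wide bins for the small group (1 is an assumed value)
  geom_histogram(data = subset(df, group == "B"),
                 binwidth = 1, alpha = 0.5)
gg
```

Because each layer is normalised on its own, both histograms have area 1 regardless of how many points or bins they contain.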

Upvotes: 1
