Reputation: 6314
I have data with a 2-level factor which I'd like to plot with ggplot2 as overlaid histograms.
My data:
set.seed(1)
df <- data.frame(y = c(rnorm(1000),rnorm(10)), group = c(rep("A",1000),rep("B",10)))
My plot:
library(ggplot2)
ggplot(df, aes(y, fill = group)) + geom_histogram(alpha = 0.5, position = "identity")
The problem is that groups A and B have very different numbers of points, so plotting them together with this code, which uses the same binwidth for both, is not ideal.
In fact it throws a warning:
`stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
Is there a way to plot overlaying histograms with different binwidths?
Upvotes: 1
Views: 812
Reputation: 78852
You can also separate out the factors and apply different binwidths:
library(dplyr)
library(ggplot2)
set.seed(1)
df <- data.frame(y = c(rnorm(1000), rnorm(10)),
                 group = c(rep("A", 1000), rep("B", 10)))
gg <- ggplot()
gg <- gg + geom_histogram(data=filter(df, group=="A"),
                          aes(y, fill=group),
                          alpha=0.5)
gg <- gg + geom_histogram(data=filter(df, group=="B"),
                          aes(y, fill=group),
                          binwidth=4, alpha=0.5)
gg
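If you would rather not hard-code a binwidth per group, one option is to derive it from the data. The sketch below is only an illustration, not part of the answer above: it assumes the Freedman-Diaconis rule (2 * IQR / n^(1/3)) is an acceptable choice, and the helper name fd_binwidth is made up for this example. It builds one histogram layer per group, each with its own binwidth.

```r
library(ggplot2)

set.seed(1)
df <- data.frame(y = c(rnorm(1000), rnorm(10)),
                 group = c(rep("A", 1000), rep("B", 10)))

# Hypothetical helper: Freedman-Diaconis bin width, 2 * IQR / n^(1/3).
# Any other rule could be swapped in here.
fd_binwidth <- function(x) 2 * IQR(x) / length(x)^(1/3)

# One bin width per level of the factor
bw <- tapply(df$y, df$group, fd_binwidth)

# One histogram layer per group, each using its own bin width
layers <- lapply(names(bw), function(g) {
  geom_histogram(data = df[df$group == g, ],
                 aes(y, fill = group),
                 binwidth = bw[[g]],
                 alpha = 0.5, position = "identity")
})

gg <- ggplot() + layers
gg
```

Because each layer gets its own data subset, this scales to more than two groups without writing each geom_histogram call by hand. The bars still show raw counts, so the small group will look tiny next to the large one.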
Upvotes: 2
Reputation: 60522
You need to work with densities, i.e. scale each histogram so that the area under it is 1. In base graphics you would set freq=FALSE in the hist function. For ggplot2 you can do:
ggplot(df, aes(y, fill = group)) + geom_histogram(aes(y=..density..))
or
ggplot(df, aes(y, fill = group)) + geom_density()
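As a rough sketch of the density approach for overlaid (rather than stacked) histograms: the example below assumes ggplot2 3.3.0 or later, where after_stat(density) is the newer spelling of ..density.., and it adds position = "identity" and alpha = 0.5 so the two groups overlap. The binwidth of 0.5 is an arbitrary value chosen for illustration.

```r
library(ggplot2)

set.seed(1)
df <- data.frame(y = c(rnorm(1000), rnorm(10)),
                 group = c(rep("A", 1000), rep("B", 10)))

# Density-scaled histograms, drawn on top of each other instead of stacked.
# binwidth = 0.5 is an arbitrary choice for this example.
p <- ggplot(df, aes(y, fill = group)) +
  geom_histogram(aes(y = after_stat(density)),
                 binwidth = 0.5, alpha = 0.5, position = "identity")
p

# Check: within each group, bar height * bar width adds up to 1
d <- layer_data(p, 1)
areas <- tapply(d$density * (d$xmax - d$xmin), d$group, sum)
areas
```

With density on the y axis, the 10-point group is drawn at the same overall scale as the 1000-point group, which is what makes the two shapes comparable.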
Upvotes: 1