Reputation: 11
I'm trying to overlay normal density curves on my stacked histograms in R using ggplot. bsa is a numerical measure recorded for two groups, treatment and control.
I have created stacked histograms for the two groups. I get an error with stat_function about the mapping needing to be a list of unevaluated mappings.
Any advice on how to do this would be appreciated.
ggplot(data=bsa, aes(x=bsa)) +geom_histogram(colours(distinct=TRUE)) + facet_grid(group~.) +
stat_function(dnorm(x, mean(bsa$bsa),sd(bsa$bsa)))+
ggtitle("Histogram of BSA amounts by group")
Upvotes: 1
Views: 2968
Reputation: 59425
Using stat_function(...) with facets is tricky. stat_function(...) takes an argument args=... which needs to be a named list of the extra arguments to the function (so in your case, mean and sd). The problem is that these cannot appear in aes(...), so you have to add the curves manually. Here is an example.
set.seed(1) # for reproducible example
df <- data.frame(bsa=rnorm(200, mean=rep(c(1,4),each=100)),
                 group=rep(c("test","control"),each=100))
# calculate mean and sd by group
stats <- aggregate(bsa~group, df, function(x) c(mean=mean(x), sd=sd(x)))
stats <- data.frame(group=stats[,1],stats[,2])
library(ggplot2)
ggplot(df, aes(x=bsa)) +
  geom_histogram(aes(y=..density..,fill=group), color="grey30")+
  with(stats[stats$group=="control",],stat_function(data=df[df$group=="control",],fun=dnorm, args=list(mean=mean, sd=sd)))+
  with(stats[stats$group=="test",],stat_function(data=df[df$group=="test",],fun=dnorm, args=list(mean=mean, sd=sd)))+
  facet_grid(group~.)
This is rather ugly, so I usually just calculate the curves external to ggplot and add them using geom_line(...).
x <- with(df, seq(min(bsa), max(bsa), len=100))
dfn <- do.call(rbind,lapply(1:nrow(stats),
               function(i) with(stats[i,],data.frame(group, x, y=dnorm(x,mean=mean,sd=sd)))))
ggplot(df, aes(x=bsa)) +
  geom_histogram(aes(y=..density..,fill=group), color="grey30")+
  geom_line(data=dfn, aes(x, y))+
  facet_grid(group~.)
This makes the ggplot code much more readable and produces pretty much the same thing.
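If you have more than two groups, the external calculation can be wrapped in a small function. The following is only a sketch: normal_curves is a hypothetical helper name (not part of ggplot2 or base R), and it assumes the value and grouping columns are passed as strings. It uses base R only and evaluates one fitted normal curve per group on a common grid.

```r
# Hypothetical helper (an illustration, not part of any package):
# one fitted normal density curve per group, on a shared x grid.
normal_curves <- function(data, value, group, n = 100) {
  # Common grid spanning the full range of the observed values
  x <- seq(min(data[[value]]), max(data[[value]]), length.out = n)
  # Split the values by group
  groups <- split(data[[value]], data[[group]])
  # Build one data frame per group using that group's mean and sd
  pieces <- lapply(names(groups), function(g) {
    v <- groups[[g]]
    res <- data.frame(x = x, y = dnorm(x, mean = mean(v), sd = sd(v)))
    res[[group]] <- g  # keep the grouping column name so facets still match
    res
  })
  do.call(rbind, pieces)
}

set.seed(1) # same simulated data as above
df <- data.frame(bsa = rnorm(200, mean = rep(c(1, 4), each = 100)),
                 group = rep(c("test", "control"), each = 100))
dfn <- normal_curves(df, "bsa", "group")
head(dfn)
```

The result has the columns x, y and group, so it can be passed to geom_line(data=dfn, aes(x, y)) in the same way as the dfn built by hand above.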
Notice that if you wanted to overlay a kernel density estimate, rather than a normal curve, this would be a lot easier:
ggplot(df, aes(x=bsa)) +
  geom_histogram(aes(y=..density..,fill=group), color="grey30")+
  stat_density(geom="line")+
  facet_grid(group~.)
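One caveat if you run these snippets on a recent ggplot2: the ..density.. notation is deprecated there in favour of after_stat(density). Below is a sketch of the same kernel density plot in the newer syntax, assuming ggplot2 3.3.0 or later. bins = 30 is just ggplot2's default written out explicitly so that the binning message goes away; pick whatever suits your data.

```r
library(ggplot2)

set.seed(1) # same simulated data as in the example above
df <- data.frame(bsa = rnorm(200, mean = rep(c(1, 4), each = 100)),
                 group = rep(c("test", "control"), each = 100))

# after_stat(density) replaces ..density.. (assumes ggplot2 >= 3.3.0)
p <- ggplot(df, aes(x = bsa)) +
  geom_histogram(aes(y = after_stat(density), fill = group),
                 color = "grey30", bins = 30) +  # bins = 30 is the default
  stat_density(geom = "line") +                  # kernel density per facet
  facet_grid(group ~ .)
p
```

The same substitution should work in the stat_function(...) and geom_line(...) versions, since only the histogram layer uses the density mapping.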
Upvotes: 3