Multiple normal distributions by factor in ggplot facet_wrap()

Question

I have the following code, and it works fine, except that I can't figure out how to pass the correct mean and sd for each level of the faceting variable to stat_function(), so that the appropriate normal distribution curve is drawn over each histogram.

p <- ggplot(data = df, aes(x=DELY_QTY)) + 
  geom_histogram(aes(x=DELY_QTY, y=..density..), color="#76C0C1", fill="#76C0C1", bins=30)+
  stat_function(fun=dnorm, args = list(mean=mean(df$DELY_QTY), sd=sd(df$DELY_QTY)), color="#C10534", size=2, alpha=0.75)+
  stat_density(geom = "line", color="#1A476F", size=2, alpha=0.75)+
  facet_wrap(~PIA_ITEM, scales = "free")

The internal structure of the data frame looks like this:

'data.frame':   66333 obs. of  2 variables:
 $ PIA_ITEM: Factor w/ 7 levels "GH26 2.6t Typ 1172-89",..: 2 2 2 2 2 2 2 2 2 2 ...
 $ DELY_QTY: int  43 37 41 73 34 53 47 51 43 34 ...

How can I set list(mean=mean(df$DELY_QTY), sd=sd(df$DELY_QTY)) properly for each facet?

structure(list(PIA_ITEM = structure(c(2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 4L, 4L, 4L, 4L, 4L, 4L), .Label = c("GH26 2.6t Typ 1172-89", 
"GH26 11,6t Typ 3611", "GH26 13,6t Typ 3621", "GH26 5,9t Typ 3613", 
"GH26 29,0t Typ 3615", "GH26 24,0t Typ 3625", "GH26 5,2t Typ 3630"
), class = "factor"), DELY_QTY = c(43L, 37L, 41L, 73L, 34L, 53L, 
47L, 51L, 43L, 34L, 30L, 44L, 51L, 84L, 16L, 24L, 12L, 11L, 20L, 
20L)), row.names = c(NA, 20L), class = "data.frame")

teunbrand · Accepted Answer

I wrote a function at some point to address this type of issue and put it in the ggh4x package. Here is a (slightly simplified) example:

library(ggplot2)
library(ggh4x)

ggplot(data = df, aes(x = DELY_QTY)) +
  geom_histogram(aes(y = after_stat(density)),
                 alpha = 0.5, bins = 30) +
  stat_density(geom = "line") +
  stat_theodensity(colour = "red") +
  facet_wrap(~ PIA_ITEM, scales = "free")
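
If you would rather not add a dependency, a rough alternative is to precompute the normal curves per group yourself and draw them with geom_line(). This is only a sketch of that idea, not how ggh4x does it internally. The data frame below is rebuilt from the 20 sample rows in the question, and the name `curves` is just an illustrative choice.

```r
library(ggplot2)

# The 20 sample rows from the question's dput() output
df <- data.frame(
  PIA_ITEM = factor(rep(c("GH26 11,6t Typ 3611", "GH26 5,9t Typ 3613"),
                        times = c(14, 6))),
  DELY_QTY = c(43, 37, 41, 73, 34, 53, 47, 51, 43, 34, 30, 44, 51, 84,
               16, 24, 12, 11, 20, 20)
)

# For each facet level, evaluate dnorm() on a grid spanning that group's
# range, using that group's own mean and sd.
# drop = TRUE skips factor levels that have no observations.
curves <- do.call(rbind, lapply(
  split(df, df$PIA_ITEM, drop = TRUE),
  function(d) {
    x <- seq(min(d$DELY_QTY), max(d$DELY_QTY), length.out = 200)
    data.frame(
      PIA_ITEM = d$PIA_ITEM[1],
      DELY_QTY = x,
      density  = dnorm(x, mean = mean(d$DELY_QTY), sd = sd(d$DELY_QTY))
    )
  }
))

# Histogram on the density scale, with the precomputed curve in each panel
p <- ggplot(df, aes(x = DELY_QTY)) +
  geom_histogram(aes(y = after_stat(density)), bins = 30, alpha = 0.5) +
  geom_line(data = curves, aes(y = density), colour = "red") +
  facet_wrap(~ PIA_ITEM, scales = "free")
```

Because `curves` carries the same PIA_ITEM column as the data, facet_wrap() routes each curve to its own panel. That is exactly what a single stat_function() call with a global mean and sd cannot do.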
