The Unfun Cat

Reputation: 31938

How to stretch a normal curve in ggplot?

I have the following script, which makes my normal curve too small:

ggplot(exercise2d_df, aes(x=residuals_list)) + 
    geom_histogram(alpha=0.2, position="identity") + 
    stat_function(fun = dnorm, args = c(mean=mean(residuals_list), sd=sd(residuals_list)), size = 1, color = "red")

My data is:

residuals_list = c(0.183377698905335, 7.18337769890574, 1.18337769890566, 4.18337769890565, 5.18337769890565, 0.183377698905655, 3.18337769890566,-0.816622301094345, -2.81662230109434, 3.18337769890566, 8.18337769890566, 2.18337769890566, 4.18337769890565, 0.183377698905655, 5.18337769890565, -10.0541259982254, -9.05412599822537, -8.05412599822537, -5.05412599822537, -4.05412599822537, -3.05412599822537, -10.0541259982254, -6.05412599822537, -8.05412599822537, -7.05412599822537, -6.05412599822537, -7.05412599822537, -7.05412599822537, -5.05412599822537, -4.05412599822537, -3.05412599822537, -11.0541259982254, -9.05412599822537, -3.05412599822537, -1.05412599822537, -7.2916296953564, -8.2916296953564, -2.2916296953564, 0.708370304643597, -5.2916296953564, -3.2916296953564, -6.2916296953564, -2.2916296953564, 1.7083703046436, -5.2916296953564, -9.2916296953564, -5.2916296953564, -4.2916296953564, -4.2916296953564, -0.291629695356403, 1.18337769890566, -4.81662230109435, 0.183377698905655, 0.183377698905655, 0.183377698905655, 5.18337769890565, -0.816622301094345, -4.81662230109435, -3.81662230109434, -1.81662230109434, -0.816622301094345, 2.18337769890566, 3.18337769890566, 6.18337769890565, 8.18337769890566, 2.94587400177463, -3.05412599822537, 3.94587400177463, 4.94587400177463, 6.94587400177463, -0.0541259982253741, -0.0541259982253741, -0.0541259982253741, 0.945874001774626, 0.945874001774626, 0.945874001774626, 0.945874001774626, 3.94587400177463, 2.94587400177463, 0.945874001774626, 1.94587400177463, -3.05412599822537, 5.7083703046436, 4.7083703046436, 1.7083703046436, 11.7083703046436, 6.7083703046436, 7.7083703046436, 2.7083703046436, 3.7083703046436, 9.7083703046436, 8.7083703046436, 6.7083703046436, 6.7083703046436, -0.291629695356403, 5.7083703046436, 4.7083703046436, -1.2916296953564, 9.7083703046436, 8.7083703046436, 1.7083703046436, 2.7083703046436, 3.7083703046436)

This code creates a graph like the following:

[Image: resulting graph]

How do I stretch the normal curve so that it fits the histogram?

(Note that this is not a question about how to superimpose a normal curve on a histogram in ggplot, even though that is what I am ultimately after, so this is not a duplicate.)

Upvotes: 3

Views: 667

Answers (1)

Greg Snow

Reputation: 49640

The area under the normal curve is currently 1, whereas the area of the histogram is the width of the bars times the number of points. If you multiply the height of the normal curve by that value, the curve and the histogram will have the same area. The following works with the default binwidth calculation, though it may be better and more direct to specify a binwidth:

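# Normal density rescaled to the histogram's area (bar width * number of points).
# The bar width is approximated as range/30, matching the default of 30 bins.
tmpfun <- function(x, mean, sd) {
    diff(range(residuals_list)) / 30 * length(residuals_list) * dnorm(x, mean, sd)
}

ggplot(exercise2d_df, aes(x = residuals_list)) + 
    geom_histogram(alpha = 0.2, position = "identity") + 
    # args is documented as a list of extra arguments passed on to fun
    stat_function(fun = tmpfun, args = list(mean = mean(residuals_list), 
        sd = sd(residuals_list)), size = 1, color = "red")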
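
As a rough sketch of the "specify a binwidth" variant mentioned above: if the bar width is set explicitly, the scale factor is exactly the number of points times that width, with no need to guess how the default bins were computed. The names `scaled_dnorm`, `binwidth` and `n` are made up for this illustration, the width of 1 is an arbitrary choice, and the data here is a random stand-in rather than the `residuals_list` from the question.

```r
library(ggplot2)

# Stand-in data (an assumption for this sketch); substitute the real
# residuals_list from the question.
set.seed(42)
residuals_list <- rnorm(100, mean = 0, sd = 5)
exercise2d_df <- data.frame(residuals_list = residuals_list)

binwidth <- 1                  # chosen explicitly; any positive width works
n <- length(residuals_list)    # number of points in the histogram

# Normal density rescaled so its area equals the histogram's area, n * binwidth.
scaled_dnorm <- function(x, mean, sd, n, binwidth) {
    n * binwidth * dnorm(x, mean, sd)
}

p <- ggplot(exercise2d_df, aes(x = residuals_list)) +
    geom_histogram(binwidth = binwidth, alpha = 0.2, position = "identity") +
    stat_function(fun = scaled_dnorm,
        args = list(mean = mean(residuals_list), sd = sd(residuals_list),
            n = n, binwidth = binwidth),
        color = "red")
print(p)
```

Passing the same `binwidth` to both `geom_histogram` and the scaling function keeps the two in sync if the width is changed later.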

Upvotes: 2
