How can I make this histogram of a dataset variable overlaid with its normal distribution simpler and fancier?

Question

Okay, so I'm working with the classic iris dataset built into R. I am using the variable Sepal.Width, subset to the species "setosa". I plotted the probability histogram and it looks like it could follow a normal distribution (bell-shaped curve), so I computed the mean and the standard deviation of Sepal.Width, which is all I need to plot the curve.

Normal/Gaussian Distribution formula:

f(x) = 1 / [σ * √(2 * π)] * e^[-(x - μ)^2 / (2 * σ^2)],

where e = Euler's number, π = pi, sigma (σ) = standard deviation, mu (μ) = mean.
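
As a sanity check on the formula, it can be coded directly and compared with R's built-in dnorm(), which computes the same density. This is only a minimal sketch: normal_pdf and the evaluation grid are names and values made up here for illustration.

```r
# Hand-written normal density, term by term from the formula above.
# `normal_pdf` is a hypothetical helper name, not part of base R.
normal_pdf <- function(x, mu, sigma) {
  1 / (sigma * sqrt(2 * pi)) * exp(-(x - mu)^2 / (2 * sigma^2))
}

sw <- iris$Sepal.Width[iris$Species == "setosa"]
mu <- mean(sw)
sigma <- sd(sw)

grid <- seq(2, 4.5, by = 0.1)  # arbitrary evaluation points

# TRUE if the two agree up to floating-point tolerance
agrees <- isTRUE(all.equal(normal_pdf(grid, mu, sigma),
                           dnorm(grid, mean = mu, sd = sigma)))
print(agrees)
```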

So here is my code:

> hist(iris$Sepal.Width[iris$Species == "setosa"], probability = TRUE,
   main = "Histogram of Sepal Width for Setosa Species Overlaid with Normal Distribution", 
   xlab = "Sepal Width (cm)", ylab = "Density", cex.main = 0.9)

> sepal_width_setosa <- iris$Sepal.Width[iris$Species == "setosa"]

> sepal_width_setosa_mean <- mean(sepal_width_setosa)

> sepal_width_setosa_mean
[1] 3.428

> sepal_width_setosa_sd <- sd(sepal_width_setosa)

> sepal_width_setosa_sd
[1] 0.3790644

> sepal_width_setosa_variance <- var(sepal_width_setosa)

> range(sepal_width_setosa)
[1] 2.3 4.4

> sepal_width_setosa_gaussian_distribution <- function(x){
  1 / (sepal_width_setosa_sd*sqrt(2*pi))*exp(-(x - sepal_width_setosa_mean)^2 / (2 * sepal_width_setosa_variance))
}


> curve(sepal_width_setosa_gaussian_distribution, from = 2.0, to = 4.5,
   col = "darkblue", lwd = 2, add = TRUE)
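
One way this could be made simpler, sketched here under the assumption that the hand-written density function is not needed for its own sake, is to pass dnorm() straight to curve(). The short names sw, m and s are introduced only for this example.

```r
sw <- iris$Sepal.Width[iris$Species == "setosa"]

hist(sw, probability = TRUE,
     main = "Sepal Width for Setosa with Fitted Normal Curve",
     xlab = "Sepal Width (cm)", ylab = "Density", cex.main = 0.9)

# Inside curve(), `x` stands for the plotting grid, not the data,
# so the mean and sd of the data are computed beforehand.
m <- mean(sw)
s <- sd(sw)

# dnorm() is the built-in normal density, so no custom function is needed
curve(dnorm(x, mean = m, sd = s), from = 2.0, to = 4.5,
      col = "darkblue", lwd = 2, add = TRUE)
```

This should draw the same curve as the custom function, since both evaluate the normal density with the sample mean and standard deviation.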

And I'll do another visual test for normality:

> qqnorm(sepal_width_setosa, main = "Normal Q-Q Plot of Sepal Width for species setosa")
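
A Q-Q plot is usually easier to read with a reference line. A small sketch using base R's qqline(); the colour and line width are arbitrary choices.

```r
sw <- iris$Sepal.Width[iris$Species == "setosa"]

qq <- qqnorm(sw, main = "Normal Q-Q Plot of Sepal Width for species setosa")

# Reference line through the first and third quartiles; points lying
# close to it suggest the sample is approximately normal.
qqline(sw, col = "darkblue", lwd = 2)
```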

And a statistical test for normality:

> library(nortest)
> ad.test(sepal_width_setosa)

Anderson-Darling normality test

data: sepal_width_setosa
A = 0.49096, p-value = 0.2102

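The p-value is greater than 0.05, so I can't reject the null hypothesis of normality at the 5% level. In other words, the test finds no evidence that the variable departs from a normal distribution.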
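
For a cross-check that needs no extra package, base R's shapiro.test() can stand in for ad.test(). Note this is a different test (Shapiro-Wilk rather than Anderson-Darling), swapped in only so the sketch runs without nortest; the 0.05 threshold is the usual convention, assumed here. In both tests the null hypothesis is that the data are normal, so a small p-value is evidence against normality and a large one is not.

```r
sw <- iris$Sepal.Width[iris$Species == "setosa"]

# Shapiro-Wilk normality test from the stats package.
# H0: the sample comes from a normal distribution.
res <- shapiro.test(sw)
alpha <- 0.05  # conventional significance level (an assumption)

if (res$p.value > alpha) {
  message("Fail to reject H0: no evidence against normality")
} else {
  message("Reject H0: evidence against normality")
}
```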

Question: Is there a way to put this in a loop to get the histogram of sepal width, overlaid with its associated normal distribution, for each of the iris species, and then show the plots side by side? And how do I add markers at 1 to 3 standard deviations from the mean?
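
A minimal sketch of one possible approach in base graphics, assuming a one-row, three-column layout via par(mfrow = ...) and vertical lines from abline() as the standard deviation markers. The colours, line types and the 3.5-sd axis padding are arbitrary choices, not the only way to do it.

```r
# One panel per species, side by side
old_par <- par(mfrow = c(1, 3))

for (sp in levels(iris$Species)) {
  sw <- iris$Sepal.Width[iris$Species == sp]
  m <- mean(sw)
  s <- sd(sw)

  hist(sw, probability = TRUE,
       xlim = c(m - 3.5 * s, m + 3.5 * s),  # leave room for the 3-sd markers
       main = paste("Sepal Width:", sp),
       xlab = "Sepal Width (cm)", ylab = "Density")

  # Fitted normal density; with add = TRUE the curve spans the current x-axis
  curve(dnorm(x, mean = m, sd = s), col = "darkblue", lwd = 2, add = TRUE)

  # Solid line at the mean, dashed lines at 1, 2 and 3 sd on either side
  abline(v = m, col = "red", lwd = 2)
  abline(v = m + c(-3:-1, 1:3) * s, col = "red", lty = 2)
}

par(old_par)  # restore the previous layout
```

If the top of the curve is clipped in a panel, passing a larger ylim to hist() should fix it.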
