Reputation: 471
As the title indicates, I am trying to plot the normal distribution and the binomial distribution in the same plot using R. My attempt is below. Is there any reason why my normal distribution looks so off? I have double-checked the mean and standard deviation, and both look fine.
n <- 151
p <- 0.2409
dev <- 4
mu <- n*p
sigma <- sqrt(n*p*(1 - p))
xmin <- round(max(mu - dev*sigma,0));
xmax <- round(min(mu + dev*sigma,n))
x <- seq(xmin, xmax)
y <- dbinom(x,n,p)
barplot(y,
col = 'lightblue',
names.arg = x,
main = 'Binomial distribution, n=151, p=0.2409')
range <- seq(mu - dev*sigma, mu + dev*sigma, 0.01)
height <- dnorm(range, mean = mu, sd = sigma)
lines(range, height, col = 'red', lwd = 3)
Upvotes: 2
Views: 4049
Reputation: 20811
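barplot is just the wrong function for your case. If you really want to use it, you have to reconcile the x-axes between barplot and lines: barplot places its bars at its own x coordinates rather than at the values in x, so a curve drawn with lines at the original x values does not line up with the bars.
By default, barplot puts each height value at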
head(c(barplot(y, plot = FALSE)))
# [1] 0.7 1.9 3.1 4.3 5.5 6.7
This can be changed through space, width, or a combination of both:
head(c(barplot(y, plot = FALSE, space = 0)))
# [1] 0.5 1.5 2.5 3.5 4.5 5.5
head(c(barplot(y, plot = FALSE, space = 0, width = 3)))
# [1] 1.5 4.5 7.5 10.5 13.5 16.5
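If you do want to stay with barplot, one way to line things up is to shift the curve into barplot's coordinates. The following is only a sketch under stated assumptions: it uses space = 0 with the default width of 1, so the bars are centred at 0.5, 1.5, and so on, and to_bar_x is a hypothetical helper introduced here for illustration, not part of base R.

```r
n <- 151; p <- 0.2409; dev <- 4
mu <- n * p
sigma <- sqrt(n * p * (1 - p))
xmin <- round(max(mu - dev * sigma, 0))
xmax <- round(min(mu + dev * sigma, n))
x <- seq(xmin, xmax)
y <- dbinom(x, n, p)

# With space = 0 and the default width = 1, bar i is centred at i - 0.5;
# barplot returns those midpoints
mids <- barplot(y, space = 0, names.arg = x, col = 'lightblue',
                main = sprintf('Binomial distribution, n=%s, p=%.4f', n, p))

# Hypothetical helper (an assumption of this sketch): map a data value
# to barplot's x coordinate under the spacing chosen above
to_bar_x <- function(v) v - xmin + 0.5

# Evaluate the density on the data scale, draw it on the barplot scale
xx <- seq(xmin, xmax, length.out = 1000)
lines(to_bar_x(xx), dnorm(xx, mean = mu, sd = sigma), col = 'red', lwd = 3)
```

The mapping only holds for this particular space and width; with other settings it is safer to interpolate against the midpoints that barplot returns.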
You can just use plot to avoid dealing with those things:
n <- 151
p <- 0.2409
dev <- 4
mu <- n*p
sigma <- sqrt(n*p*(1 - p))
xmin <- round(max(mu - dev*sigma,0));
xmax <- round(min(mu + dev*sigma,n))
x <- seq(xmin, xmax)
y <- dbinom(x,n,p)
plot(x, y, type = 'h', lwd = 10, lend = 3, col = 'lightblue',
ann = FALSE, las = 1, bty = 'l', yaxs = 'i', ylim = c(0, 0.08))
title(main = sprintf('Binomial distribution, n=%s, p=%.3f', n, p))
lines(x, dnorm(x, mean = mu, sd = sigma), col = 'red', lwd = 7)
xx <- seq(min(x), max(x), length.out = 1000)
lines(xx, dnorm(xx, mean = mu, sd = sigma), col = 'white')
The "bars" in this figure depend on your choice of lwd and your device dimensions. If you need finer control over that, you can use rect, which takes a little more work.
w <- 0.75
plot(x, y, type = 'n', ann = FALSE, las = 1, bty = 'l', yaxs = 'i', ylim = c(0, 0.08))
rect(x - w / 2, 0, x + w / 2, y, col = 'lightblue')
lines(xx, dnorm(xx, mean = mu, sd = sigma), col = 'red', lwd = 3)
title(main = sprintf('Binomial distribution, n=%s, p=%.3f', n, p))
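Since the question mentions double-checking the mean and standard deviation, a numeric comparison can confirm that the curve itself is fine once both are drawn on the same axis. This is a minimal sketch that reuses the parameters from the question; the names binom_p, norm_d and max_abs_diff are introduced here for illustration only.

```r
n <- 151; p <- 0.2409; dev <- 4
mu <- n * p
sigma <- sqrt(n * p * (1 - p))
x <- seq(round(max(mu - dev * sigma, 0)), round(min(mu + dev * sigma, n)))

binom_p <- dbinom(x, n, p)                  # exact binomial probabilities
norm_d  <- dnorm(x, mean = mu, sd = sigma)  # normal density at the same x

# Largest pointwise gap between the two, for comparison with the peak height
max_abs_diff <- max(abs(binom_p - norm_d))
print(max_abs_diff)
print(max(binom_p))

# Both should peak at the same integer
print(x[which.max(binom_p)])
print(x[which.max(norm_d)])
```

If the gap is small relative to the peak, any visible mismatch in the plot comes from the axes rather than from mu or sigma.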
Upvotes: 4
Reputation: 2613
You can use the ggplot2 package:
library(ggplot2)
n <- 151
p <- 0.2409
dev <- 4
mu <- n*p
sigma <- sqrt(n*p*(1 - p))
xmin <- round(max(mu - dev*sigma, 0))
xmax <- round(min(mu + dev*sigma, n))
x <- seq(xmin, xmax)
y <- dbinom(x, n, p)
df <- data.frame(x, y)
ggplot(df, aes(x = x, y = y)) +
  geom_bar(stat = "identity", fill = 'dodgerblue3') +
  labs(title = sprintf("Binomial distribution, n=%s, p=%.4f", n, p),
       x = "",
       y = "") +
  theme_minimal() +
  # Overlay the normal density; the bars are one unit apart, so no rescaling is needed
  # (linewidth needs ggplot2 >= 3.4.0; use size on older versions)
  stat_function(
    fun = dnorm,
    args = list(mean = mu, sd = sigma),
    col = "red", linewidth = 1.4)
Upvotes: 2
Reputation: 1763
You could also do it using the ggplot2 package. The normal density has to be evaluated at the same x values as the binomial probabilities. Evaluating it at qnorm quantiles of evenly spaced probabilities and then plotting against x distorts the curve, because those quantiles are not evenly spaced (and are infinite at probabilities 0 and 1):
n <- 151
p <- 0.2409
dev <- 4
mu <- n*p
sigma <- sqrt(n*p*(1 - p))
xmin <- round(max(mu - dev*sigma,0));
xmax <- round(min(mu + dev*sigma,n))
x <- seq(xmin, xmax)
y <- dbinom(x,n,p)
z <- dnorm(x, mean = mu, sd = sigma)
library(magrittr)
library(ggplot2)
data.frame(x, y, z) %>%
ggplot(aes(x = x)) +
geom_col(aes(y = y)) +
geom_line(aes(y = z), colour = "red")
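When building the curve column for a plot like this, it matters where the density is evaluated. The sketch below contrasts two ways of computing it, assuming the same n and p as above and the x range 15 to 57 that those parameters produce; z_bad and z_good are illustrative names only.

```r
n <- 151; p <- 0.2409
mu <- n * p
sigma <- sqrt(n * p * (1 - p))
x <- 15:57

# Quantiles of evenly spaced probabilities are not evenly spaced in x,
# and the first and last are -Inf and Inf
q <- qnorm(seq(0, 1, length.out = length(x)), mean = mu, sd = sigma)

z_bad  <- dnorm(q, mean = mu, sd = sigma)  # density at the quantiles, not at x
z_good <- dnorm(x, mean = mu, sd = sigma)  # density at the plotted x values

# Side-by-side view of the first few rows
print(head(data.frame(x = x, q = q, z_bad = z_bad, z_good = z_good)))
```

Plotting z_bad against x pairs each density with the wrong horizontal position, which is enough to make a correct mean and standard deviation look wrong.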
Upvotes: 0