Reputation: 375
Problem statement - Variable X has a mean of 15 and a standard deviation of 2.
What is the minimum percentage of X values that lie between 8 and 17?
I know about the 68-95-99.7 empirical rule. From Google I found that the percentage of values within 1.5 standard deviations is 86.64%. My code so far:
import scipy.stats
import numpy as np
X=np.random.normal(15,2)
As I understand it,
13-17 is within 1 standard deviation, containing 68% of the values.
9-21 is within 3 standard deviations, containing 99.7% of the values.
7-23 is within 4 standard deviations. So 8 is 3.5 standard deviations below the mean.
How do I find the percentage of values from 8 to 17?
Upvotes: 0
Views: 4384
Reputation: 2249
Assuming X is normally distributed (as your code does), you basically want to know the area under the Probability Density Function (PDF) from x1=8 to x2=17.
The integral of the PDF from minus infinity up to a value x is the Cumulative Distribution Function (CDF) at x.
Thus, to find the area between two specific values of x you need to integrate the PDF between these values, which is equivalent to computing CDF(x2) - CDF(x1).
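Written out as a formula, this is a short sketch in which f stands for the PDF and F for the CDF of X (the symbols are my own notation, not something from the question):

```latex
P(8 \le X \le 17) = \int_{8}^{17} f(x)\,dx = F(17) - F(8)
```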
So, in Python, we could do
import numpy as np
import scipy.stats as sps
import matplotlib.pyplot as plt
mu = 15
sd = 2
# define the distribution
dist = sps.norm(loc=mu, scale=sd)
x = np.linspace(dist.ppf(.00001), dist.ppf(.99999))
# Probability Density Function
pdf = dist.pdf(x)
# Cumulative Distribution Function
cdf = dist.cdf(x)
and plot to take a look
fig, axs = plt.subplots(1, 2, figsize=(12, 5))
axs[0].plot(x, pdf, color='k')
axs[0].fill_between(
    x[(x>=8)&(x<=17)],
    pdf[(x>=8)&(x<=17)],
    alpha=.25
)
axs[0].set(
    title='PDF'
)
axs[1].plot(x, cdf)
axs[1].axhline(dist.cdf(8), color='r', ls='--')
axs[1].axhline(dist.cdf(17), color='r', ls='--')
axs[1].set(
    title='CDF'
)
plt.show()
So, the value we want is that area, which we can calculate as
cdf_at_8 = dist.cdf(8)
cdf_at_17 = dist.cdf(17)
cdf_between_8_17 = cdf_at_17 - cdf_at_8
print(f"{cdf_between_8_17:.1%}")
which gives 84.1%.
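If you want to convince yourself that the CDF difference really is the area under the PDF, here is a minimal sketch that computes the same probability three ways: by numerically integrating the PDF with scipy.integrate.quad, by subtracting CDF values, and by converting the bounds to z-scores on the standard normal. It assumes X is normally distributed with mean 15 and standard deviation 2; the variable names (area, cdf_diff, z_low, z_high, z_diff) are my own and purely illustrative.

```python
from scipy import integrate, stats

mu, sd = 15, 2
dist = stats.norm(loc=mu, scale=sd)

# 1) Numerically integrate the PDF over [8, 17].
#    quad returns (integral estimate, estimated absolute error).
area, abs_err = integrate.quad(dist.pdf, 8, 17)

# 2) Same probability as a difference of CDF values.
cdf_diff = dist.cdf(17) - dist.cdf(8)

# 3) Same probability via z-scores on the standard normal N(0, 1).
z_low = (8 - mu) / sd    # 3.5 standard deviations below the mean
z_high = (17 - mu) / sd  # 1 standard deviation above the mean
z_diff = stats.norm.cdf(z_high) - stats.norm.cdf(z_low)

print(f"quad: {area:.4%}  cdf: {cdf_diff:.4%}  z-scores: {z_diff:.4%}")
```

All three numbers should agree to within numerical precision and round to the same 84.1%. The z-score version also ties back to the question's reasoning: 8 is 3.5 standard deviations below the mean and 17 is 1 standard deviation above it.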
Upvotes: 3