Reputation: 403
I have the following code below that prints the PDF graph for a particular mean and standard deviation.
Now I need to find the actual probability, of a particular value. So for example if my mean is 0, and my value is 0, my probability is 1. This is usually done by calculating the area under the curve. Similar to this:
http://homepage.divms.uiowa.edu/~mbognar/applets/normal.html
I am not sure how to approach this problem
import numpy as np
import matplotlib
import matplotlib.pyplot as plt
def normal(power, mean, std, val):
a = 1/(np.sqrt(2*np.pi)*std)
diff = np.abs(np.power(val-mean, power))
b = np.exp(-(diff)/(2*std*std))
return a*b
pdf_array = []
array = np.arange(-2,2,0.1)
print array
for i in array:
print i
pdf = normal(2, 0, 0.1, i)
print pdf
pdf_array.append(pdf)
plt.plot(array, pdf_array)
plt.ylabel('some numbers')
plt.axis([-2, 2, 0, 5])
plt.show()
print
Upvotes: 20
Views: 109484
Reputation: 165
If you want to write it from scratch:
class PDF():
def __init__(self,mu=0, sigma=1):
self.mean = mu
self.stdev = sigma
self.data = []
def calculate_mean(self):
self.mean = sum(self.data) // len(self.data)
return self.mean
def calculate_stdev(self,sample=True):
if sample:
n = len(self.data)-1
else:
n = len(self.data)
mean = self.mean
sigma = 0
for el in self.data:
sigma += (el - mean)**2
sigma = math.sqrt(sigma / n)
self.stdev = sigma
return self.stdev
def pdf(self, x):
return (1.0 / (self.stdev * math.sqrt(2*math.pi))) * math.exp(-0.5*((x - self.mean) / self.stdev) ** 2)
Upvotes: 11
Reputation:
The area under a curve y = f(x)
from x = a
to x = b
is the same as the integral of f(x)dx
from x = a
to x = b
. Scipy has a quick easy way to do integrals. And just so you understand, the probability of finding a single point in that area cannot be one because the idea is that the total area under the curve is one (unless MAYBE it's a delta function). So you should get 0 ≤ probability of value < 1
for any particular value of interest. There may be different ways of doing it, but a conventional way is to assign confidence intervals along the x-axis like this. I would read up on Gaussian curves and normalization before continuing to code it.
Upvotes: 5
Reputation: 2764
Unless you have a reason to implement this yourself. All these functions are available in scipy.stats.norm
I think you asking for the cdf, then use this code:
from scipy.stats import norm
print(norm.cdf(x, mean, std))
Upvotes: 17