Reputation: 89
I'm currently using the following code:
prob = scipy.stats.norm(mu, np.sqrt(sigma)).pdf(o) # correct
return prob
I don't think the variables need much explanation: mu is the expected value, sigma is my variance, and o is an observation. I want to find the probability of the given observation. The code works, but it is very slow because I call it many times. I got a much faster result by writing out the normal distribution formula myself and computing the probability from that.
Is there a smarter way for me to call this function?
Upvotes: 0
Views: 1024
Reputation: 2417
Two approaches...
Vectorisation
This takes advantage of the fact that scipy/numpy perform calculations on whole arrays, so a single call evaluates every observation at once...
import numpy as np
from scipy.stats import norm
observations = np.random.rand(1000)
mu = np.mean(observations)
sigma = np.var(observations)
norm(mu, np.sqrt(sigma)).pdf(observations)
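As a related sketch, the same densities can be computed without building a frozen distribution object, by passing loc and scale straight to norm.pdf. This may help when the call sits in a hot loop, though the actual gain depends on your data and SciPy version, so it is worth timing yourself. The seed and the names sd, frozen_pdf and direct_pdf are illustrative assumptions, not part of the original code.

```python
import numpy as np
from scipy.stats import norm

# Illustrative data; the seed is an arbitrary choice for reproducibility.
rng = np.random.default_rng(0)
observations = rng.random(1000)
mu = np.mean(observations)
sd = np.std(observations)  # standard deviation, which is what `scale` expects

# Frozen distribution, as in the snippet above.
frozen_pdf = norm(mu, sd).pdf(observations)

# Same densities, passing loc/scale directly instead of freezing first.
direct_pdf = norm.pdf(observations, loc=mu, scale=sd)
```

Both calls evaluate all 1000 observations in one go and should agree to floating-point precision.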
List Comprehension
This is a lot slower, but if your observations are in a list then you can use a list comprehension...
list_of_observations = list(np.random.rand(1000))
mu = np.mean(list_of_observations)
sigma = np.var(list_of_observations)
prob = [norm(mu, np.sqrt(sigma)).pdf(o) for o in list_of_observations]
...but it's easy to convert the list to an array with np.asarray() and use the former solution...
norm(mu, np.sqrt(sigma)).pdf(np.asarray(list_of_observations))
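Since the question mentions that writing out the normal distribution by hand was faster, here is a minimal sketch of what that might look like in vectorised form. The function name normal_pdf and its variance parameter are hypothetical; the formula is the standard normal density exp(-(x - mu)^2 / (2 * variance)) / sqrt(2 * pi * variance).

```python
import numpy as np

def normal_pdf(x, mu, variance):
    """Normal density evaluated elementwise; takes the variance, not the std."""
    x = np.asarray(x, dtype=float)  # accepts lists, scalars or arrays
    return np.exp(-((x - mu) ** 2) / (2.0 * variance)) / np.sqrt(2.0 * np.pi * variance)

# Standard normal evaluated at 0 and 1.
densities = normal_pdf([0.0, 1.0], mu=0.0, variance=1.0)
```

This skips the argument checking that scipy.stats does, which is likely where the speed difference comes from, so it assumes the variance is strictly positive.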
Note also that if you are calculating the variance (sigma) yourself then rather than using np.var() you can get the standard deviation directly using np.std().
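A small check of that point, using made-up values: np.std() gives the same number as the square root of np.var(), because both default to ddof=0.

```python
import numpy as np

# Made-up sample purely for illustration.
values = np.array([1.0, 2.0, 3.0, 4.0])

variance = np.var(values)         # population variance (ddof=0)
std_from_var = np.sqrt(variance)  # what np.sqrt(sigma) computes in the code above
std_direct = np.std(values)       # same quantity in one call
```

If you need the sample estimate instead, pass ddof=1 to both functions so they stay consistent.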
Upvotes: 1