Sun Bear

Reputation: 8234

Optimizing NumPy script that calculates Normal Distribution

I wrote the following NumPy test script to draw normally distributed samples for a set of inputs. I suspect I am not using NumPy's vectorized operations efficiently to process the input data.
Can you show me the NumPy way of processing these inputs?

import numpy as np

#Inputs
mu = np.array( [1, 2, 3, 4, 5, 6, 7, 8, 9, 10], dtype='uint8' )
sigma = np.array( [1., 1., 1., 1., 1., 3., 3., 3., 3., 3.] )
number = np.array( [ 5, 10, 15, 20, 25, 25, 20, 15, 10, 5 ], dtype='uint16' )

#Processing
np.random.seed(0)
norms = [  np.random.normal(i, sigma[n], number[n]) for n, i in enumerate(mu) ]
print( norms )

a = np.concatenate( [ np.ceil(i) for i in norms ] )
print( a )

#Output
result = np.histogram( a, bins=np.arange(np.amin(a), np.amax(a)+1, 1, dtype='uint8' ) )
print( result )

Upvotes: 1

Views: 90

Answers (1)

Quang Hoang

Reputation: 150735

One way to vectorize your code is to draw all of the standard normal samples in a single call, then scale and shift each sample by the sigma and mu of its group:

import matplotlib.pyplot as plt

np.random.seed(0)

# random samples
samples = np.random.normal(size=number.sum())

# scale
samples = samples*sigma.repeat(number) + mu.repeat(number)

# equivalent to your `a`
out = np.ceil(samples)

# visualize to compare output:
fig, axes = plt.subplots(1, 2)

axes[0].hist(out, bins=np.arange(out.min(), out.max()+1))
axes[0].set_title('my code')

axes[1].hist(a, bins=np.arange(a.min(), a.max()+1))
axes[1].set_title('yours')

plt.show()

Output:

[Image: two side-by-side histograms, titled 'my code' and 'yours']
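
The plots compare the two approaches visually; a numeric check is also possible. The following is a minimal sketch, not part of the original answer, that runs the loop from the question and the vectorized version with the same seed and compares them with np.allclose. It assumes mu has ten entries so that all three input arrays line up, and it relies on NumPy's legacy global generator producing the same stream of standard normals whether they are drawn in chunks or all at once. The `+ 2` in the bin edges is my own choice (it gives the largest value a bin of its own) and differs from the bins used in the question.

```python
import numpy as np

# Inputs from the question (assumption: mu has ten entries, matching
# the lengths of sigma and number).
mu = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], dtype='uint8')
sigma = np.array([1., 1., 1., 1., 1., 3., 3., 3., 3., 3.])
number = np.array([5, 10, 15, 20, 25, 25, 20, 15, 10, 5], dtype='uint16')

# Loop version from the question: one call to normal() per group.
np.random.seed(0)
a = np.concatenate([np.ceil(np.random.normal(m, s, int(n)))
                    for m, s, n in zip(mu, sigma, number)])

# Vectorized version from the answer: one call for all samples, then
# scale and shift each sample by its group's sigma and mu.
np.random.seed(0)
samples = np.random.normal(size=int(number.sum()))
out = np.ceil(samples * sigma.repeat(number) + mu.repeat(number))

# With the same seed, both versions should produce the same values.
print(np.allclose(out, a))

# Histogram with one unit-wide bin per integer value. Bin edges are kept
# as floats because the samples can be negative.
edges = np.arange(out.min(), out.max() + 2)
counts, edges = np.histogram(out, bins=edges)
print(counts.sum() == number.sum())
```

If the first print shows False on your installation, the assumption about the random stream does not hold there, and the two versions should only be compared statistically.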

Upvotes: 2
