isedgar

Reputation: 53

Efficient way to create the probability distribution of a list of numbers with numpy

This is an example of what I am trying to do. Suppose the following numpy array:

A = np.array([3, 0, 1, 5, 7]) # in practice, this array is a huge array of float numbers: A.shape[0] >= 1000000

I need the fastest possible way to get the following result:

result = []

for a in A:
    result.append( 1 / np.exp(A - a).sum() )

result = np.array(result)

print(result)

>>> [1.58297157e-02 7.88115138e-04 2.14231906e-03 1.16966657e-01 8.64273193e-01]

Option 1 (faster than previous code):

result = 1 / np.exp(A - A[:,None]).sum(axis=1)

print(result)

>>> [1.58297157e-02 7.88115138e-04 2.14231906e-03 1.16966657e-01 8.64273193e-01]

Is there a faster way to get "result"?

Upvotes: 0

Views: 1817

Answers (2)

isedgar

Reputation: 53

Yes: scipy.special.softmax did the trick

from scipy.special import softmax

result = softmax(A)

Thank you @j1-lee and @Karl Knechtel
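
For anyone who wants to check that this really is the same quantity as the loop in the question, here is a minimal sketch on the small example array. The names loop_result and fast_result are only illustrative; the identity it relies on is 1 / sum_j exp(A_j - a) = exp(a) / sum_j exp(A_j), which is the softmax of A.

```python
import numpy as np
from scipy.special import softmax

A = np.array([3, 0, 1, 5, 7], dtype=float)

# Reference: the element-by-element loop from the question (O(n^2) work).
loop_result = np.array([1 / np.exp(A - a).sum() for a in A])

# Single vectorized call (O(n) work, no n x n intermediate array).
fast_result = softmax(A)

# Both approaches should agree up to floating-point rounding.
print(np.allclose(loop_result, fast_result))
```

Unlike Option 1 in the question, this never builds the n x n difference matrix, which for A.shape[0] >= 1000000 would not fit in memory.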

Upvotes: 1

Karl Knechtel

Reputation: 61643

Rather than normalizing each value separately (which effectively adds up all the values again for every element), compute the exponentials once and then normalize once at the end. So:

raw = np.exp(A)
result = raw / sum(raw)

(In my testing, the builtin sum is over 2.5x as fast as np.sum for summing a small array. I did not test with larger ones.)
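
A possible variant of the same idea, as a sketch rather than a definitive implementation: subtracting the maximum before exponentiating leaves the result mathematically unchanged but keeps np.exp from overflowing on large inputs. The function name stable_softmax is made up for this example. It uses the array's own .sum() method on the assumption that, for arrays with a million or more elements, it avoids the Python-level iteration of the builtin sum.

```python
import numpy as np

def stable_softmax(x):
    """Softmax of a 1-D array, shifted by its maximum for numerical stability.

    Hypothetical helper name, used here only for illustration.
    """
    x = np.asarray(x, dtype=float)
    shifted = np.exp(x - x.max())   # largest term is exp(0) = 1, so no overflow
    return shifted / shifted.sum()  # one O(n) sum, done once for the whole array

A = np.array([3, 0, 1, 5, 7])
result = stable_softmax(A)
print(result)

# Inputs this large would overflow in a plain np.exp(x) / np.exp(x).sum().
big = stable_softmax(np.array([1000.0, 1001.0]))
print(big)
```

The shift works because exp(x - m) / sum(exp(x - m)) cancels the common factor exp(-m) in numerator and denominator.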

Upvotes: 3
