Kurt Peek

Reputation: 57461

Calculate the softmax of an array column-wise using numpy

Following https://classroom.udacity.com/courses/ud730/lessons/6370362152/concepts/63815621490923, I'm trying to write a "softmax" function which, when given a 2-dimensional array as input, calculates the softmax of each column. I wrote the following script to test it:

import numpy as np

#scores=np.array([1.0,2.0,3.0])

scores=np.array([[1,2,3,6],
                [2,4,5,6],
                [3,8,7,6]])

def softmax(x):
    if x.ndim==1:
        S=np.sum(np.exp(x))
        return np.exp(x)/S
    elif x.ndim==2:
        result=np.zeros_like(x)
        M,N=x.shape
        for n in range(N):
            S=np.sum(np.exp(x[:,n]))
            result[:,n]=np.exp(x[:,n])/S
        return result
    else:
        print("The input array is not 1- or 2-dimensional.")

s=softmax(scores)
print(s)

However, the result "s" turns out to be an array of zeros:

[[0 0 0 0]
 [0 0 0 0]
 [0 0 0 0]]

If I remove the "/S" in the for-loop, the 'un-normalized' result is as I would expect it to be; somehow the "/S" division appears to make all the elements zero instead of dividing each element by S, as I would expect. What is wrong with the code?

Upvotes: 3

Views: 4460

Answers (3)

gboffi

Reputation: 25023

The problem in your code is how you instantiate the placeholder for the results that you're about to compute, that is

    result=np.zeros_like(x)

because if x is an array of integers, result is also an array of integers, and when you assign to it,

        result[:,n]=np.exp(x[:,n])/S

a conversion to integer is enforced. After you normalize by dividing by S, every value lies strictly between 0 and 1 (each column has more than one entry); the conversion truncates towards zero, so you end up with an array of zeros.

You said that, if you don't normalize, the result is different from zero. That's because in that case the numbers being converted to integers are larger than 1.
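To make the truncation concrete, here is a minimal sketch on a small integer array. The array `x` and the names `col`, `normalized`, `unnormalized` and `safe` are made up for illustration and are not part of the question's code.

```python
import numpy as np

x = np.array([[1, 2], [3, 4]])        # integer input, as in the question
result = np.zeros_like(x)             # inherits the integer dtype of x

col = np.exp(x[:, 0])                 # float values, roughly 2.72 and 20.09

result[:, 0] = col / col.sum()        # values between 0 and 1 are cast to int
normalized = result[:, 0].copy()      # both entries have been truncated to 0

result[:, 0] = col                    # without the division the values exceed 1
unnormalized = result[:, 0].copy()    # truncated to 2 and 20, so not zero

safe = np.zeros_like(x, dtype=float)  # one way to force a float placeholder
safe[:, 0] = col / col.sum()

print(result.dtype, normalized, unnormalized)
print(safe[:, 0], safe[:, 0].sum())
```

The only thing that changes between the two assignments is whether the values are below 1 before the cast, which matches the behaviour described in the question.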

A possible solution, which you can use in your code as is, is to instantiate an array of floats, irrespective of the type of x

    result=np.zeros(x.shape)

but I have to say that your code computes the exponential twice and uses loops where you could use vectorized operations.

Here is a different implementation that (a) avoids loops and (b) avoids unnecessary evaluations of the exponential,

def sm(a):
    s = np.exp(a)
    if a.ndim == 1:
        return s/s.sum()
    elif a.ndim == 2:
        return s/s.sum(0) 
    else:
        return

A small test,

In [32]: sm(np.array([[1,2,3,6],
                [2,4,5,6],
                [3,8,7,6]]))
Out[32]: 
array([[ 0.09003057,  0.00242826,  0.01587624,  0.33333333],
       [ 0.24472847,  0.01794253,  0.11731043,  0.33333333],
       [ 0.66524096,  0.97962921,  0.86681333,  0.33333333]])


Note that it also works with an integer array as input, because np.exp always returns a float array.

Addendum

Following the suggestion from n13 the function can be rewritten as

def sm(a):
    s = np.exp(a)
    if a.ndim <3: return s/s.sum(0) 

Thank you n13.

PS: when I wrote the addendum I had not realized that n13 had posted an answer of their own...
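As a quick sanity check of the compact version, the sketch below repeats `sm` from the addendum and confirms that it handles both 1-D and 2-D input. The test arrays `v` and `m` are illustrative names, and the 3-D call is only there to show that the function falls through and returns None for higher dimensions.

```python
import numpy as np

def sm(a):
    s = np.exp(a)
    if a.ndim < 3:
        return s / s.sum(0)
    # for ndim >= 3 nothing is returned explicitly, so the result is None

v = sm(np.array([1, 2, 3]))            # 1-D: sum(0) is the total of all entries
m = sm(np.array([[1, 2, 3, 6],
                 [2, 4, 5, 6],
                 [3, 8, 7, 6]]))       # 2-D: sum(0) gives one total per column

print(v.sum())                         # the 1-D softmax sums to 1 (up to rounding)
print(m.sum(axis=0))                   # every column sums to 1 (up to rounding)
print(sm(np.zeros((2, 2, 2))))         # None
```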

Upvotes: 3

n13

Reputation: 6983

Numpy has some nifty matrix operations that make this problem a lot easier and simpler to solve.

Calculating the exponential works on a matrix of any dimension.

The sum() method takes an argument, axis, which restricts the sum to a given axis; columns map to axis 0 in our case.

def softmax(x):
    exp = np.exp(x)  # element-wise exponential; works for arrays of any dimension
    return exp / exp.sum(0)  # axis=0 sums down each column, giving one total per column
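To illustrate what the axis argument does here, the following sketch prints the shapes involved. It reuses the scores array from the question; the row-wise variant with `keepdims=True` is an extra shown only for contrast and is not something the question asked for.

```python
import numpy as np

scores = np.array([[1, 2, 3, 6],
                   [2, 4, 5, 6],
                   [3, 8, 7, 6]])
e = np.exp(scores)

col_sums = e.sum(axis=0)                 # shape (4,): one total per column
col_softmax = e / col_sums               # (3, 4) / (4,) broadcasts over the rows

row_sums = e.sum(axis=1, keepdims=True)  # shape (3, 1): one total per row
row_softmax = e / row_sums               # (3, 4) / (3, 1) broadcasts over the columns

print(col_sums.shape, row_sums.shape)
print(col_softmax.sum(axis=0))           # each column sums to 1 (up to rounding)
print(row_softmax.sum(axis=1))           # each row sums to 1 (up to rounding)
```

The column-wise case needs no keepdims because a shape (4,) array already lines up with the last axis of the (3, 4) array under NumPy's broadcasting rules.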

Upvotes: 2

Kurt Peek

Reputation: 57461

The reason for the "zeros" lies in the data type of the inputs, which are of the "int" type. Converting the input to "float" solved the problem:

import numpy as np

#scores=np.array([1.0,2.0,3.0])

scores=np.array([[1,2,3,6],
                [2,4,5,6],
                [3,8,7,6]])

def softmax(x):
    x=x.astype(float)
    if x.ndim==1:
        S=np.sum(np.exp(x))
        return np.exp(x)/S
    elif x.ndim==2:
        result=np.zeros_like(x)
        M,N=x.shape
        for n in range(N):
            S=np.sum(np.exp(x[:,n]))
            result[:,n]=np.exp(x[:,n])/S
        return result
    else:
        print("The input array is not 1- or 2-dimensional.")

s=softmax(scores)
print(s)

Note that I've added "x=x.astype(float)" as the first line of the function body, so that np.zeros_like(x) now creates a float array. This yields the expected output:

[[ 0.09003057  0.00242826  0.01587624  0.33333333]
 [ 0.24472847  0.01794253  0.11731043  0.33333333]
 [ 0.66524096  0.97962921  0.86681333  0.33333333]]

Upvotes: 6
