mchangun

Reputation: 10362

Vectorize this function in Numpy Python

I have an array of 60,000 numbers from 0-9:

In [1]: trainY
Out[1]: 
array([[5],
       [0],
       [4],
       ..., 
       [5],
       [6],
       [8]], dtype=int8)

And I have a function to transform each element in trainY into a 10 element vector as per below:

0 -> [1,0,0,0,0,0,0,0,0,0]
1 -> [0,1,0,0,0,0,0,0,0,0]
2 -> [0,0,1,0,0,0,0,0,0,0]
3 -> [0,0,0,1,0,0,0,0,0,0]
...
9 -> [0,0,0,0,0,0,0,0,0,1]

The function:

def transform_y(y):
    new_y = np.zeros(10)
    new_y[y] = 1
    return new_y

My code only works on one element at a time. What's the best way to transform my whole trainY array at once (other than a for loop)? Should I use map? Can someone also show me how to rewrite the function so that it's vectorised?

Thank you.

Upvotes: 1

Views: 3514

Answers (2)

Saullo G. P. Castro

Reputation: 59005

You can speed up your code considerably by creating a 2-D array with ones along the diagonal and then selecting the rows indexed by the input array:

a = np.array([[5],
              [0],
              [4],
              ..., 
              [5],
              [6],
              [8]], dtype=np.int8)

new_y = np.eye(a.max()+1)[a.ravel()]
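As a minimal sketch of the row-selection idea above: the four labels below are made-up stand-ins for trainY, and the class count of 10 is taken from the question rather than computed from the data.

```python
import numpy as np

# Illustrative stand-in for trainY: a column of digit labels.
a = np.array([[5], [0], [4], [9]], dtype=np.int8)

# Assumption: 10 classes (digits 0-9), as stated in the question.
identity = np.eye(10)

# Row i of the identity matrix is the one-hot vector for label i,
# so indexing with the flattened labels picks one row per label.
new_y = identity[a.ravel()]

print(new_y.shape)  # (4, 10)
print(new_y[0])     # the one-hot vector for label 5
```

Hard-coding 10 instead of a.max()+1 keeps the output width fixed even if a particular batch happens not to contain the label 9.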

An even faster solution would be to create the output array with zeros and then populate it according to the indices from a:

new_y = np.zeros((a.shape[0], a.max()+1))
# one (row, column) pair per label: row i, column a[i]
new_y[np.arange(a.shape[0]), a.ravel()] = 1
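The zero-fill approach can be sketched on a tiny input as follows. The names labels, flat and one_hot are illustrative only, and the check at the end simply confirms that it agrees with the np.eye version.

```python
import numpy as np

# Illustrative stand-in for trainY.
labels = np.array([[5], [0], [4], [9]], dtype=np.int8)
flat = labels.ravel()

# Assumption: 10 classes (digits 0-9), as in the question.
one_hot = np.zeros((flat.size, 10))

# Pair each row index with its label to set exactly one entry per row.
one_hot[np.arange(flat.size), flat] = 1

# Both approaches should produce the same array.
same_as_eye = np.array_equal(one_hot, np.eye(10)[flat])
print(same_as_eye)  # True
```

This version writes only one element per row instead of copying whole rows out of an identity matrix, which is the likely reason it is faster on large inputs.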

Upvotes: 4

Bruce

Reputation: 7132

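You can wrap the function with np.vectorize. Because it returns an array rather than a scalar, pass a signature so the output shape is known:

def transform_y(y):
    new_y = np.zeros(10)
    new_y[y] = 1
    return new_y

# '()->(n)' means: scalar in, 1-D vector out
transform_y = np.vectorize(transform_y, signature='()->(n)')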

see http://telliott99.blogspot.ch/2010/03/vectorize-in-numpy.html
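A small usage sketch for np.vectorize on a column-shaped input like trainY, assuming a NumPy version whose np.vectorize accepts the signature argument. The name vec_transform and the three sample labels are hypothetical. Note that np.vectorize is a convenience wrapper around a Python-level loop, so it is not expected to match the speed of the indexing approaches.

```python
import numpy as np

def transform_y(y):
    new_y = np.zeros(10)
    new_y[y] = 1
    return new_y

# '()->(n)' declares a scalar input and a 1-D vector output.
vec_transform = np.vectorize(transform_y, signature='()->(n)')

# Illustrative stand-in for trainY, shape (3, 1).
labels = np.array([[5], [0], [4]], dtype=np.int8)

# The input's own shape is kept and the vector axis is appended.
col_result = vec_transform(labels)            # shape (3, 1, 10)

# Flatten first to get one row per label.
flat_result = vec_transform(labels.ravel())   # shape (3, 10)

print(col_result.shape, flat_result.shape)
```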

Upvotes: 3
