Hossein

Reputation: 25924

How can I vectorize this (numpy) operation in python?

I have two arrays of shape (batch, dim). Currently I use a simple loop that subtracts 1 from one entry in each row of the first array (error); the entry is chosen by the position of the largest value in the matching row of the second array (label):

per_ts_loss=0
for i, idx in enumerate(np.argmax(label, axis=1)):
    error[i, idx] -=1
    per_ts_loss += error[i, idx]

How can I vectorize this?

For example, error and label can look like this:

error:
array([[0.5488135 , 0.71518937, 0.60276338, 0.54488318, 0.4236548 ],
       [0.64589411, 0.43758721, 0.891773  , 0.96366276, 0.38344152]])
label:
array([[0, 0, 0, 1, 0],
       [0, 1, 0, 0, 0]])

For this example, running the loop below gives the following results:

for i, idx in enumerate(np.argmax(label, axis=1)):
    error[i, idx] -= 1
    per_ts_loss += error[i, idx]

result :

error: 
 [[ 0.5488135   0.71518937  0.60276338  0.54488318  0.4236548 ]
 [ 0.64589411  0.43758721  0.891773    0.96366276  0.38344152]]
label: 
 [[ 0.  0.  0.  1.  0.]
 [ 0.  1.  0.  0.  0.]]

error (column index 3 in row 0 and column index 1 in row 1 are changed):
[[ 0.5488135   0.71518937  0.60276338 -0.45511682  0.4236548 ]
 [ 0.64589411 -0.56241279  0.891773    0.96366276  0.38344152]]
per_ts_loss: 
 -1.01752960574

Here is the code itself: https://ideone.com/e1k8ra

I am stuck on how to use the result of np.argmax. It is a vector of column indexes, one per row, and it can't simply be used like this:

 error[:, np.argmax(label, axis=1)] -=1

So I'm stuck here!

Upvotes: 0

Views: 79

Answers (2)

hpaulj

Reputation: 231335

Replace:

error[:, np.argmax(label, axis=1)] -=1

with:

error[np.arange(error.shape[0]), np.argmax(label, axis=1)] -=1

and then, after that update, compute the loss with:

loss = error[np.arange(error.shape[0]), np.argmax(label, axis=1)].sum()

In your example you are changing, and summing, error[0,3] and error[1,1], or in short error[[0,1],[3,1]].
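
A minimal sketch to check this against the loop from the question, using the example data given there. The names rows, cols, loop_error, vec_error and block_shape are only illustrative. The last line shows why the slice form does not work: error[:, cols] picks every listed column for every row, so it selects a (batch, batch) block rather than one entry per row.

```python
import numpy as np

# Example data from the question
error = np.array([[0.5488135, 0.71518937, 0.60276338, 0.54488318, 0.4236548],
                  [0.64589411, 0.43758721, 0.891773, 0.96366276, 0.38344152]])
label = np.array([[0, 0, 0, 1, 0],
                  [0, 1, 0, 0, 0]])

# Reference result: the original loop, run on a copy
loop_error = error.copy()
loop_loss = 0.0
for i, idx in enumerate(np.argmax(label, axis=1)):
    loop_error[i, idx] -= 1
    loop_loss += loop_error[i, idx]

# Vectorized version: pair each row index with its own column index
rows = np.arange(error.shape[0])   # [0, 1]
cols = np.argmax(label, axis=1)    # [3, 1]
vec_error = error.copy()
vec_error[rows, cols] -= 1         # touches error[0, 3] and error[1, 1] only
vec_loss = vec_error[rows, cols].sum()

# The slice form selects all listed columns for every row instead
block_shape = error[:, cols].shape  # (batch, batch), not (batch,)

print(np.allclose(loop_error, vec_error), np.isclose(loop_loss, vec_loss))
print(block_shape)
```

Note that the loss has to be read after the in-place subtraction, as in the loop, since it sums the updated entries.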

Upvotes: 1

Tomas Giro

Reputation: 4267

Maybe this:

import numpy as np


error = np.array([[0.32783139, 0.29204386, 0.0572163 , 0.96162543, 0.8343454 ],
       [0.67308787, 0.27715222, 0.11738748, 0.091061  , 0.51806117]])

label= np.array([[0, 0, 0, 1, 0 ],
           [0, 1, 0, 0, 0]])



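def f(error, label):
    t = np.zeros(error.shape)
    rows = np.arange(error.shape[0])        # one row index per sample
    argma = np.argmax(label, axis=1)        # column of the largest label in each row
    t[rows, argma] = -1                     # offset array: -1 at the selected entries
    print(t)
    error += t                              # updates the caller's array in place
    per_ts_loss = error[rows, argma].sum()  # sum of the updated entries
    return per_ts_loss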


f(error, label)

Output:

[[ 0.  0.  0. -1.  0.]
 [ 0. -1.  0.  0.  0.]]
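
If label is strictly one-hot (exactly one 1 per row, which the question's example suggests but does not state), the offset array printed above is simply -label, so the label array itself can serve as the mask. A rough sketch under that assumption, using the question's data; mask and updated are illustrative names, and this is not equivalent to the argmax version if a row contains several nonzero entries.

```python
import numpy as np

# Example data from the question
error = np.array([[0.5488135, 0.71518937, 0.60276338, 0.54488318, 0.4236548],
                  [0.64589411, 0.43758721, 0.891773, 0.96366276, 0.38344152]])
label = np.array([[0, 0, 0, 1, 0],
                  [0, 1, 0, 0, 0]])

# Assumption: label is one-hot, so subtracting it lowers exactly one entry per row by 1
mask = label.astype(bool)
updated = error - label              # new array; error itself is left unchanged
per_ts_loss = updated[mask].sum()    # sum of the entries that were lowered

print(updated)
print(per_ts_loss)
```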

Upvotes: 0
