Efficient way of constructing a matrix with all elements zero except one in numpy

I want to compute the output error of a neural network for each input by comparing the output signal with its true output value, so I need two matrices for this task.

My output matrix has shape (n*1), but the label only gives the index of the neuron that should be activated. So I need a matrix of the same shape with all elements equal to zero except the one whose index equals the label. I could do that with a function, but I wonder: is there a built-in method in numpy that can do that for me?

Upvotes: 6

Answers (3)

LomaxOnTheRun

Reputation: 702

One liner:

x = np.identity(n)[id]
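
A minimal sketch of how this one-liner behaves, assuming illustrative values for `n` and the label (the names `label`, `labels`, `column` and `batch` are not from the question). Note that `np.identity(n)` builds the full n-by-n matrix before a row is picked, so it costs more memory than filling a single zero vector. Indexing with an array of labels gives one row per label, which may be handy for batches.

```python
import numpy as np

n = 5        # number of output neurons (illustrative value)
label = 2    # index of the neuron that should be activated (illustrative value)

# Row `label` of the identity matrix is the one-hot vector, shape (n,)
one_hot = np.identity(n)[label]

# Reshape to (n, 1) if the network output is a column vector
column = one_hot.reshape(n, 1)

# Indexing with an array of labels yields one one-hot row per label
labels = np.array([0, 3, 3, 1])     # hypothetical batch of labels
batch = np.identity(n)[labels]      # shape (4, n)
```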

Upvotes: 4

umutto

Reputation: 7700

You can do that in multiple ways using numpy or the standard library. One way is to create an array of zeros and set the value at the given index to 1.

n = len(result)

a = np.zeros((n,))
a[id] = 1
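
Since the question describes an output of shape (n*1), the same idea can be wrapped in a small helper that returns a column vector. This is only a sketch: `one_hot_column` and the sample `output` values are made up for illustration.

```python
import numpy as np

def one_hot_column(label, n):
    """Return an (n, 1) array of zeros with a single 1 at row `label`."""
    target = np.zeros((n, 1))
    target[label, 0] = 1.0
    return target

# Hypothetical network output for a 3-neuron output layer
output = np.array([[0.1], [0.7], [0.2]])

target = one_hot_column(1, output.shape[0])
error = output - target    # element-wise output error, shape (3, 1)
```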

It is probably going to be the fastest one as well:

>> %timeit a = np.zeros((n,)); a[id] = 1
1000000 loops, best of 3: 634 ns per loop

Alternatively, you can use numpy.pad to pad a [1] array with zeros. But this will almost certainly be slower because of the padding logic.

np.pad([1], (id, n - id - 1), 'constant', constant_values=0)

As expected, it is an order of magnitude slower:

>> %timeit np.lib.pad([1],(id,n-id),'constant', constant_values=(0))
10000 loops, best of 3: 47.4 µs per loop

And you can try a list comprehension, as suggested in the comments:

results = [7]

np.matrix([1 if x == id else 0 for x in results])

But it is much slower than the first method as well:

>> %timeit np.matrix([1 if x == id else 0 for x in results])
100000 loops, best of 3: 7.25 µs per loop

Edit: In my opinion, though, if you want to compute the neural network's error, you should just use np.argmax and check whether the prediction was correct. The element-wise error calculation may give you more noise than useful information. You can build a confusion matrix if you suspect your network tends to confuse similar classes.
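
A rough sketch of that argmax-based check, assuming the outputs are stacked one row per sample. The arrays below are made-up values, and the confusion matrix convention (rows are true labels, columns are predictions) is an assumption.

```python
import numpy as np

# Hypothetical outputs for 3 samples and 3 classes, one row per sample
outputs = np.array([[0.1, 0.7, 0.2],
                    [0.6, 0.3, 0.1],
                    [0.2, 0.2, 0.6]])
labels = np.array([1, 2, 2])            # true class index for each sample

predicted = outputs.argmax(axis=1)      # index of the strongest neuron per sample
correct = predicted == labels           # True where the prediction matches the label
accuracy = correct.mean()               # fraction of correct predictions

# Confusion matrix: rows are true labels, columns are predicted labels
n_classes = outputs.shape[1]
confusion = np.zeros((n_classes, n_classes), dtype=int)
np.add.at(confusion, (labels, predicted), 1)
```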

Upvotes: 5

Daniel F

Reputation: 14399

A few other methods that also seem to be slower than @umutto's above:

%timeit a = np.zeros((n,)); a[id] = 1 #umutto's method
The slowest run took 45.34 times longer than the fastest. This could mean that an intermediate result is being cached.
1000000 loops, best of 3: 1.53 µs per loop

Boolean construction:

%timeit a = np.arange(n) == id
The slowest run took 13.98 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 3.76 µs per loop

Boolean construction to integer:

%timeit a = (np.arange(n) == id).astype(int)
The slowest run took 15.31 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 5.47 µs per loop
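
The boolean construction also extends to several labels at once through broadcasting. A small sketch, with `n` and `labels` as illustrative values only:

```python
import numpy as np

n = 4
labels = np.array([2, 0, 3])    # hypothetical batch of labels

# Compare a (3, 1) column of labels against a (4,) row of indices;
# broadcasting yields a (3, 4) boolean array with one True per row
one_hot = (labels[:, None] == np.arange(n)).astype(int)
```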

List construction:

%timeit a = [0]*n; a[id] = 1; a=np.asarray(a)
10000 loops, best of 3: 77.3 µs per loop

Using scipy.sparse:

%timeit a = sparse.coo_matrix(([1], ([id],[0])), shape=(n,1))
10000 loops, best of 3: 51.1 µs per loop

Now, what's actually faster may depend on what's being cached, but it seems like constructing the zero array is probably fastest, especially if you can use np.zeros_like(result) instead of np.zeros(len(result)).
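
For reference, a short sketch of the np.zeros_like variant, assuming `result` is the (n, 1) network output. The values and the name `label` are illustrative. The target takes its shape and dtype from `result`, so no separate reshape is needed.

```python
import numpy as np

result = np.array([[0.1], [0.7], [0.2]])   # hypothetical (n, 1) network output
label = 1                                  # index of the neuron that should be activated

target = np.zeros_like(result)   # same shape and dtype as result
target[label] = 1                # sets the whole row `label`, here a single element
```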

Upvotes: 4
