astromax

Reputation: 6331

Masked Array Calculation on np.array

Suppose, for example, I have a NumPy ndarray with shape (10, 10):

import numpy as np
a = np.linspace(-1,1,100).reshape(10,10)

I'd like to perform a calculation on the first element of each row if and only if the first element is smaller than zero. To do this, I've been thinking of using a masked array:

a = np.ma.MaskedArray(a,mask=(np.ones_like(a)*(a[:,0]<0)).T)

>>> (np.ones_like(a)*(a[:,0]<0)).T
array([[ 1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.],
       [ 1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.],
       [ 1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.],
       [ 1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.],
       [ 1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.]])

This will allow me to perform calculations only on the rows in which the first element is less than zero. (In this example the other elements of those rows happen to be negative too, but I've also tested a case where only the first elements are negative and the rest are positive.) I have a few questions at this point:

1) Should I add a second mask that covers every column except the first before performing my calculation? (To make the example concrete: I'd like to add 1000 to the first element of each row where that element is less than zero.)

2) Is masking an array permanent? Is there an unmask method?

3) Is this the easiest way to perform this type of calculation?

Any suggestions would be appreciated. Thanks!
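
On question 2, a minimal sketch of how np.ma handles masks may help. It relies on the numpy.ma convention that True in the mask marks an element as masked (excluded), so the mask below is inverted relative to the one above in order to leave the negative-first-element rows visible. The names row_is_negative and m are illustrative only, and copy=True is an assumption made here so the original array stays untouched.

```python
import numpy as np

a = np.linspace(-1, 1, 100).reshape(10, 10)

# True where a row's first element is negative.
row_is_negative = a[:, 0] < 0

# In np.ma, True means "masked out", so mask the rows we do NOT want.
mask = np.repeat(~row_is_negative[:, np.newaxis], a.shape[1], axis=1)
m = np.ma.MaskedArray(a, mask=mask, copy=True)

# count() reports only unmasked entries: 5 rows of 10 elements each.
unmasked_count = m.count()

# Masking hides values but does not delete them; .data still holds them.
data_intact = np.array_equal(m.data, a)

# Assigning nomask clears a (soft) mask, making every element visible again.
m.mask = np.ma.nomask
all_visible = m.count() == a.size
```

So masking is not permanent in this sketch: the underlying values stay in .data, and a soft mask can be cleared by assigning np.ma.nomask.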

Upvotes: 0

Views: 204

Answers (2)

John Greenall

Reputation: 1690

In my opinion, a masked array is overkill for something this simple. I would use NumPy's fancy indexing instead:

# Get the indices of the rows to update
rowsToUpdate = np.nonzero(a[:,0]<0)[0]
# Increment the first element of the target rows by 1000
a[rowsToUpdate,0] += 1000
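
As a possible variant, the boolean array can be used as the row index directly, without going through np.nonzero. This is only a sketch under the same setup as the question; negative_first, original and changed are names introduced here for illustration.

```python
import numpy as np

a = np.linspace(-1, 1, 100).reshape(10, 10)
original = a.copy()

# Boolean mask over rows: True where the row's first element is negative.
negative_first = a[:, 0] < 0

# Select rows with the mask and column 0 with an integer; += writes back in place.
a[negative_first, 0] += 1000

# Record which entries differ from the original array.
changed = a != original
```

Comparing against the saved copy shows that only first-column entries in the selected rows were modified.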

Upvotes: 1

Phillip Cloud

Reputation: 25652

You could do the following using pandas:

import numpy as np
from pandas import DataFrame  # DataFrame is the workhorse of pandas

a = DataFrame(np.linspace(-1, 1, 100).reshape(10, 10))
mask = a[0] < 0 # a[0] is the 0th column of a
suba = a[mask]

# do some calcs with suba ... make sure the index remains the same

a[mask] = suba
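
For the concrete case in the question (adding 1000 to the first element of each selected row), a shorter sketch using label-based indexing might look like this. It assumes the default integer column labels, so 0 refers to the first column; original is a copy kept here only for comparison.

```python
import numpy as np
import pandas as pd

a = pd.DataFrame(np.linspace(-1, 1, 100).reshape(10, 10))
original = a.copy()

# Boolean Series: True for rows whose first column is negative.
mask = a[0] < 0

# .loc takes the row mask and the column label together, so only
# the first column of the selected rows is updated.
a.loc[mask, 0] += 1000
```

This avoids the intermediate suba frame when the calculation touches a single column.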

Upvotes: 1
