PUJA

Reputation: 649

Handling masked numpy array

I have a masked numpy array. While processing each element, I first need to check whether that element is masked; if it is, I need to skip it.

I have tried this:

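from netCDF4 import Dataset

data = Dataset('test.nc')
# nc_dims (a list of dimension names) is not defined in this snippet;
# it is assumed to be set up earlier.
dim_size = len(data.dimensions[nc_dims[0]])
model_dry_tropo_corr = data.variables['model_dry_tropo_corr'][:]
solid_earth_tide = data.variables['solid_earth_tide'][:]

for i in range(0, dim_size):
    # Try to skip the element if the first variable is masked here
    try:
        model_dry_tropo_corr[i].mask = True
        continue
    except:
        pass

    # Same check for the second variable
    try:
        solid_earth_tide[i].mask = True
        continue
    except:
        pass

    correction = model_dry_tropo_corr[i] / 2 + solid_earth_tide[i]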

Is there a more efficient way to do this? Any suggestions or comments are highly appreciated.

Upvotes: 0

Views: 2000

Answers (2)

hpaulj

Reputation: 231385

I'm puzzled about this code:

try:
    model_dry_tropo_corr[i].mask = True
    continue
except:
    pass

I don't have netCDF4 installed, but it appears from the documentation that your variable will look like, and may even be, a numpy.ma masked array.

It would be helpful if you printed all or part of this variable, with attributes like shape and dtype.
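For instance, a quick inspection could look like the sketch below. The array `var` is a hypothetical stand-in I made up for `data.variables['model_dry_tropo_corr'][:]`, since I can't read your file; the values are not real data.

```python
import numpy as np

# Hypothetical stand-in for data.variables['model_dry_tropo_corr'][:]
var = np.ma.masked_array([2.0, 4.0, 6.0, 8.0],
                         mask=[False, True, False, False])

print(type(var))                 # shows whether this is a numpy.ma.MaskedArray
print(var.shape, var.dtype)      # dimensions and element type
print(np.ma.count_masked(var))   # how many elements are masked
print(var[:3])                   # a small slice, masked entries show as --
```

Posting that kind of output makes it much easier to see what the loop is really dealing with.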

I can make a masked array with an expression like:

In [746]: M=np.ma.masked_where(np.arange(10)%3==0,np.arange(10))

In [747]: M
Out[747]: 
masked_array(data = [-- 1 2 -- 4 5 -- 7 8 --],
             mask = [ True False False  True False False  True False False  True],
       fill_value = 999999)

I can test whether the mask for a given element is True or False with:

In [748]: M.mask[2]
Out[748]: False

In [749]: M.mask[3]
Out[749]: True

But if I index first,

In [754]: M[2]
Out[754]: 2

In [755]: M[3]
Out[755]: masked

In [756]: M[2].mask=True
...
AttributeError: 'numpy.int32' object has no attribute 'mask'

In [757]: M[3].mask=True

So yes, your try/except will skip the elements that have the mask set to True, at least with the NumPy version used in this session.

But I think it would be clearer to do:

 if model_dry_tropo_corr.mask[i]:
     continue

But that is still iterative.
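If you do keep the loop, here is a minimal sketch that checks both variables. The two arrays are made-up stand-ins for your netCDF variables, and `skip` and `corrections` are names I introduced for illustration. I use `np.ma.getmaskarray` rather than `.mask` because `.mask` can be the scalar `np.ma.nomask` when nothing is masked, and that can't be indexed.

```python
import numpy as np

# Made-up stand-ins for the two netCDF variables (not real data)
model_dry_tropo_corr = np.ma.masked_array([2.0, 4.0, 6.0, 8.0],
                                          mask=[False, True, False, False])
solid_earth_tide = np.ma.masked_array([0.1, 0.2, 0.3, 0.4],
                                      mask=[False, False, False, True])

# getmaskarray always returns a full boolean array, even if nothing is masked
skip = (np.ma.getmaskarray(model_dry_tropo_corr)
        | np.ma.getmaskarray(solid_earth_tide))

corrections = {}
for i in range(model_dry_tropo_corr.size):
    if skip[i]:
        continue  # at least one of the inputs is masked at this index
    corrections[i] = model_dry_tropo_corr[i] / 2 + solid_earth_tide[i]

print(corrections)
```

Only indices 0 and 2 survive here, since index 1 is masked in the first array and index 3 in the second.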

But as @user3404344 showed, you could perform the math on the whole variables at once, and the masking will carry over. That could be a problem, though, if the masked values are 'bad' and cause errors in the calculation.

If I define another masked array and add the two:

In [764]: N=np.ma.masked_where(np.arange(10)%4==0,np.arange(10))

In [765]: N+M
Out[765]: 
masked_array(data = [-- 2 4 -- -- 10 -- 14 -- --],
             mask = [ True False False  True  True False  True False  True  True],
       fill_value = 999999)

You can see how elements that were masked in either M or N are masked in the result.

I can use the compressed method to get only the valid elements:

In [766]: (N+M).compressed()
Out[766]: array([ 2,  4, 10, 14])

Filling can also be handy when doing math with masked arrays:

In [779]: N.filled(0)+M.filled(0)
Out[779]: array([ 0,  2,  4,  3,  4, 10,  6, 14,  8,  9])

I could use filled to neutralize problem calculations, and still mask those values:

In [785]: z=np.ma.masked_array(N.filled(0)+M.filled(0),mask=N.mask|M.mask)

In [786]: z
Out[786]: 
masked_array(data = [-- 2 4 -- -- 10 -- 14 -- --],
             mask = [ True False False  True  True False  True False  True  True],
       fill_value = 999999)

Oops, I don't need to worry about the masked values messing up the calculation. The masked addition is doing the filling for me:

In [787]: (N+M).data   
Out[787]: array([ 0,  2,  4,  3,  4, 10,  6, 14,  8,  9])

In [788]: N.data+M.data    # raw unmasked addition
Out[788]: array([ 0,  2,  4,  6,  8, 10, 12, 14, 16, 18])

In [789]: z.data     # same as the (N+M).data
Out[789]: array([ 0,  2,  4,  3,  4, 10,  6, 14,  8,  9])
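Since the numbers sitting under the mask in `.data` are an implementation detail, I would not read them directly. A small sketch of two safer ways to get at the result, reusing the same M and N as above (`total` and `as_float` are names I made up here):

```python
import numpy as np

M = np.ma.masked_where(np.arange(10) % 3 == 0, np.arange(10))
N = np.ma.masked_where(np.arange(10) % 4 == 0, np.arange(10))

total = N + M

# Masked-aware reductions ignore the masked slots
print(total.mean())

# Or convert to a plain float array with NaN marking the masked slots
as_float = total.astype(float).filled(np.nan)
print(as_float)
print(np.nanmean(as_float))
```

Both means are taken over the four valid elements only.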

Upvotes: 0

user3404344

Reputation: 1727

Instead of a loop you could use:

correction = model_dry_tropo_corr/2 + solid_earth_tide

This will create a new masked array that holds both your results and the combined mask. You can then access the unmasked values from the new array.
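A minimal sketch of that, using made-up arrays in place of the two netCDF variables (the values are not real data, and `valid_values` and `valid_index` are names I introduced for illustration):

```python
import numpy as np

# Made-up stand-ins for the two netCDF variables
model_dry_tropo_corr = np.ma.masked_array([2.0, 4.0, 6.0, 8.0],
                                          mask=[False, True, False, False])
solid_earth_tide = np.ma.masked_array([0.1, 0.2, 0.3, 0.4],
                                      mask=[False, False, False, True])

# One vectorized expression; an element is masked in the result
# if it is masked in either input
correction = model_dry_tropo_corr / 2 + solid_earth_tide

# Unmasked results as a plain ndarray
valid_values = correction.compressed()

# Positions of those results in the original arrays
valid_index = np.flatnonzero(~np.ma.getmaskarray(correction))

print(correction)
print(valid_index, valid_values)
```

`compressed()` drops the positions, so keep `valid_index` if you need to know which records the values came from.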

Upvotes: 1
