Reputation: 492

If statements with masked arrays in python

I am trying to run a program with nested conditions on masked arrays. The same function works fine on arrays without a mask. It looks like this:

import numpy as np
x = np.random.rand(100)
x = x.reshape(4,25)
x = np.ma.masked_less(x,0.2) 
y = np.random.rand(100)
y = y.reshape(4,25)
y = np.ma.masked_less(y,0.2)
z = np.zeros_like(x)

#say that both arrays are masked in the same positions.

for i in range(len(x)):
    for j in range(len(y)):
        if x[i,j] >= y[i,j]:
            if (x[i,j]-y[i,j]) > (z[i,j-1]):
                z[i,j] = 0.
            else:
                z[i,j] = 1.
        else:
            z[i,j] = z[i,j-1] - (x[i,j]-y[i,j])

I expect to get an array with the same characteristics as the input data (x and y in this case), i.e. masked as well. However, the result I get is either a totally masked array or one filled with values and no mask, as shown here:

z = 
masked_array(data =
[[-- -- -- ..., -- -- --]
[-- -- -- ..., -- -- --]
[-- -- -- ..., -- -- --]
..., 
[-- -- -- ..., -- -- --]
[-- -- -- ..., -- -- --]
[-- -- -- ..., -- -- --]],
         mask =
[[ True  True  True ...,  True  True  True]
[ True  True  True ...,  True  True  True]
[ True  True  True ...,  True  True  True]
..., 
[ True  True  True ...,  True  True  True]
[ True  True  True ...,  True  True  True]
[ True  True  True ...,  True  True  True]],
   fill_value = 9.96920996839e+36)

z = 
masked_array(data =
[[9.0 9.0 9.0 ..., 9.0 9.0 9.0]
[9.0 9.0 9.0 ..., 9.0 9.0 9.0]
[9.0 9.0 9.0 ..., 9.0 9.0 9.0]
..., 
[9.0 9.0 9.0 ..., 9.0 9.0 9.0]
[9.0 9.0 9.0 ..., 9.0 9.0 9.0]
[9.0 9.0 9.0 ..., 9.0 9.0 9.0]],
         mask =
[[False False False ..., False False False]
[False False False ..., False False False]
[False False False ..., False False False]
..., 
[False False False ..., False False False]
[False False False ..., False False False]
[False False False ..., False False False]],
   fill_value = 9.96920996839e+36)

when actually I want something like this:

z = 
masked_array(data =
[[9.0 -- -- ..., -- -- --]
[8.7 -- -- ..., -- -- --]
[-- -- -- ..., -- -- --]
..., 
[1.0 -- -- ..., -- -- --]
[-- 3.6 -- ..., -- -- --]
[-- -- -- ..., -- -- --]],
         mask =
[[ False  True  True ...,  True  True  True]
[ False  True  True ...,  True  True  True]
[ True  True  True ...,  True  True  True]
..., 
[ False  True  True ...,  True  True  True]
[ True  False  True ...,  True  True  True]
[ True  True  True ...,  True  True  True]],
   fill_value = 9.96920996839e+36)

I have read the available documentation on masked arrays, as well as similar questions here, but none of them gives a satisfactory explanation of this. Could it be that conditional statements work in a similar way to numpy.where, in the sense that they only return the indices where the condition holds?
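
On that last point, a minimal sketch of what a single masked element does inside an `if` test may help. It assumes small hypothetical 1D arrays rather than the 4x25 data above. Indexing a masked position returns the constant np.ma.masked, which is falsy, so the `else` branch runs and the masked value propagates into z.

```python
import numpy as np

# Small hypothetical arrays; the first element of each is below 0.2 and gets masked
x = np.ma.masked_less(np.array([0.1, 0.7, 0.9]), 0.2)
y = np.ma.masked_less(np.array([0.1, 0.5, 0.95]), 0.2)

# Indexing a masked position returns the special constant np.ma.masked
is_masked_constant = x[0] is np.ma.masked

# A comparison involving a masked element is itself masked, and np.ma.masked
# is falsy, so the `if` test falls through to the `else` branch
branches = []
for i in range(len(x)):
    if x[i] >= y[i]:
        branches.append("if")
    else:
        branches.append("else")

# Arithmetic with a masked element stays masked, so the assignment masks z[0]
z = np.ma.zeros(3)
z[0] = 0.0 - (x[0] - y[0])
```

So an `if` on a scalar element does not behave like numpy.where: it evaluates one truth value at a time, and a masked element simply counts as false.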

Upvotes: 1

Answers (3)

hurrdrought

Reputation: 492

I have solved my problem by using an 'external' loop. By this I mean that my core function uses only a single loop, instead of 2 or 3 depending on my data (1D to 3D arrays), i.e.

import numpy as np

def func(x, y):
    # x and y are 1D masked arrays
    # say that both arrays are masked in the same positions.
    z = np.zeros_like(x)

    for i in range(len(x)):
        if x[i] >= y[i]:
            if (x[i]-y[i]) > (z[i-1]):
                z[i] = 0.
            else:
                z[i] = 1.
        else:
            # for i == 0, z[i-1] wraps around to the last element of z
            z[i] = z[i-1] - (x[i]-y[i])

    return z

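Then, if I have, say, a 3D array from a netCDF file, I apply my function as:

# x and y are 3D masked arrays; x[i,j] and y[i,j] are 1D slices along the last axis
result = np.zeros_like(x)
for i in range(x.shape[0]):
    for j in range(x.shape[1]):
        result[i,j] = func(x[i,j], y[i,j])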

Please note that using numpy.where is a good and faster option as well. However, with my limited experience in Python (and programming in general), I am not yet taking advantage of the NumPy machinery to vectorize my function. For now, my function seems to work as expected.
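
A minimal sketch of the numpy.where route mentioned above, using np.ma.where so that the mask is carried through. It is only an illustration under a simplifying assumption: the rule is reduced to a non-recursive one (1 where x >= y, otherwise y - x), so the dependence on the previous element z[i-1] is left out and would still need a loop.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.ma.masked_less(rng.random((4, 25)), 0.2)
# Give y the same mask as x, matching the "masked in the same positions" assumption
y = np.ma.array(rng.random((4, 25)), mask=np.ma.getmaskarray(x))

# Simplified, non-recursive rule (an assumption for illustration only):
# 1 where x >= y, otherwise y - x. Masked inputs give masked outputs.
z = np.ma.where(x >= y, 1.0, y - x)
```

Here z comes back as a masked array whose mask matches that of the inputs, without any explicit loop.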

Upvotes: 0

pausag

Reputation: 146

There is another way of doing the computations, without masking the arrays. If you want, you can still mask the z array at the end.

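    import numpy as np

    x = np.random.rand(100)
    x = x.reshape(4,25)
    y = np.random.rand(100)
    y = y.reshape(4,25)
    z = np.zeros_like(x)

    # z shifted one column to the right, so that z_prev[i,j] == z[i,j-1]
    # (column 0 wraps around to the last column, as z[i,-1] does in the loop)
    z_prev = np.roll(z, 1, axis=1)

    # first if - nested if
    idxs_1 = (x >= y) & ((x-y) > z_prev)
    z[idxs_1] = 0

    # first if - nested else
    idxs_2 = (x >= y) & ((x-y) <= z_prev)
    z[idxs_2] = 1

    # final else
    idxs_3 = x < y
    z_prev = np.roll(z, 1, axis=1)  # refresh the shifted copy after the updates above

    # single pass only: unlike the loop, values written here do not feed later columns
    z[idxs_3] = z_prev[idxs_3] - (x[idxs_3] - y[idxs_3])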

You will need to double-check the correctness of the boolean indexing on some test data.
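
One way to run such a check, sketched on a small hypothetical test array: compare the values picked by boolean indexing on a shifted copy of z with explicit z[i, j-1] lookups. The use of np.roll for the shift is an assumption of this sketch.

```python
import numpy as np

# Small test array with distinct values so that any mix-up is visible
z = np.arange(12, dtype=float).reshape(3, 4)
# Boolean mask that includes column 0, to exercise the wrap-around case
idxs = np.array([[True, False, True, False],
                 [False, True, False, False],
                 [True, True, False, True]])

# Reference: explicit z[i, j-1] lookups, in the row-major order
# that boolean indexing also uses
expected = np.array([z[i, j - 1] for i, j in zip(*np.nonzero(idxs))])

# Candidate: shift z one column to the right, then index with the same mask
z_prev = np.roll(z, 1, axis=1)
candidate = z_prev[idxs]

shift_matches_loop = np.array_equal(candidate, expected)
```

If shift_matches_loop is False for a given way of shifting, the vectorized expression is not reading the same neighbours as the loop.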

Upvotes: 2

pausag

Reputation: 146

There is probably a typo in the second if-statement. It should be

for i in range(len(x)):
    for j in range(len(y)):
        if x[i,j] >= y[i,j]:
            if (x[i,j]-y[i,j]) > (z[i,j-1]): # not x - y[i,j], but x[i,j]-y[i,j] ??
                # and further down

On my computer, it actually works and produces the expected results, i.e., the masked array with masked/unmasked mixed together:

 In [2]: z
 Out[2]: 
 masked_array(data =
  [[0.34864202355178786 1.0 1.0 1.6118423555903627 -- 0.0 0.0 0.0 0.0 0.0 0.0
   0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0]
  [0.32457641778594915 1.0 1.0 -- -- 0.0 -- 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
   0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0]
  [-- 1.0 1.541983077540757 1.0 0.0 0.0 -- 0.0 0.0 0.0 -- 0.0 0.0 -- 0.0 0.0
   0.0 0.0 0.0 0.0 0.0 -- 0.0 0.0 --]
  [-- -- 1.0 -- 0.0 0.0 0.0 0.0 0.0 -- 0.0 -- 0.0 0.0 0.0 -- 0.0 -- 0.0 --
   -- 0.0 0.0 0.0 0.0]],
              mask =
  [[False False False False  True False False False False False False False
   False False False False False False False False False False False False
   False]
  [False False False  True  True False  True False False False False False
   False False False False False False False False False False False False
   False]
  [ True False False False False False  True False False False  True False
   False  True False False False False False False False  True False False
    True]
  [ True  True False  True False False False False False  True False  True
   False False False  True False  True False  True  True False False False
   False]],
        fill_value = 1e+20)
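
In the output above, only the first four columns of each row hold values other than 0.0. That appears consistent with range(len(y)) running just four times, since len() of a 4x25 array returns the number of rows. A minimal sketch of the difference, assuming the intent was to visit every column:

```python
import numpy as np

x = np.ma.masked_less(np.random.rand(100).reshape(4, 25), 0.2)

n_rows, n_cols = x.shape   # (4, 25)
rows_from_len = len(x)     # len() only reports the first dimension

# Count how many (i, j) pairs each loop style visits
visited_with_len = sum(1 for i in range(len(x)) for j in range(len(x)))
visited_with_shape = sum(1 for i in range(n_rows) for j in range(n_cols))
```

With len() used for both loops only 16 of the 100 elements are visited; taking the bounds from x.shape covers all of them.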

Upvotes: 1
