Reputation: 3084
I have 2-D data containing bad values (0 indicates bad). My goal is to replace each bad value with its nearest neighbor that isn't bad.
SciPy's NearestNDInterpolator seems like a nice way to do this. In the 2-D case, it accepts a (number of points) x 2 array of indices and a (number of points) x 1 array of corresponding values to interpolate from.
So, I need to get a subset of the indices and values: those that are "good." The code below achieves this, but coordinates = array(list(ndindex(n_y, n_x))) and where(values != 0)[0] are both messy. Is there a cleaner way to do this?
from numpy import array, ndindex, where
# data is the 2-D input array; n_y and n_x are the number of points along each dimension.
coordinates = array(list(ndindex(n_y, n_x)))
values = data.flatten()
nonzero_ind = where(values != 0)[0]
nonzero_coordinates = coordinates[nonzero_ind, :]
nonzero_values = values[nonzero_ind]
Thanks.
Upvotes: 3
Views: 185
Reputation: 46578
Another way to approach this problem is to run your array through an image filter that automatically 'closes' those holes. scipy.ndimage has such a filter, called grey_closing:
>>> import numpy as np
>>> from scipy import ndimage
>>> a = np.arange(1,26).reshape(5,5)
>>> a[2,2] = 0
>>> a
array([[ 1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10],
       [11, 12,  0, 14, 15],
       [16, 17, 18, 19, 20],
       [21, 22, 23, 24, 25]])
>>> ndimage.grey_closing(a, size=2)
array([[ 7,  7,  8,  9, 10],
       [ 7,  7,  8,  9, 10],
       [12, 12, 12, 14, 15],
       [17, 17, 18, 19, 20],
       [22, 22, 23, 24, 25]])
But this has unfortunate edge effects (which you can change a bit with the mode parameter). To avoid this, you could just take the new values from where the original array was 0 and place them into the original array:
>>> np.where(a, a, ndimage.grey_closing(a, size=2))
array([[ 1, 2, 3, 4, 5],
[ 6, 7, 8, 9, 10],
[11, 12, 12, 14, 15],
[16, 17, 18, 19, 20],
[21, 22, 23, 24, 25]])
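The same idea can be wrapped in a small helper. This is only a sketch: the function name fill_zeros_by_closing is made up for illustration, and it assumes 0 is the only marker for a bad value. Note that closing is a morphological filter, so the value it puts in a hole is not guaranteed to be the value of the nearest good neighbor.

```python
import numpy as np
from scipy import ndimage


def fill_zeros_by_closing(a, size=2):
    """Replace zeros in `a` with the grey-closed value at the same position.

    Hypothetical helper: non-zero entries are returned untouched, so the
    edge effects of the closing never reach the good data.
    """
    closed = ndimage.grey_closing(a, size=size)
    return np.where(a != 0, a, closed)


# Same 5x5 example as above, with a single hole in the middle.
a = np.arange(1, 26).reshape(5, 5)
a[2, 2] = 0
filled = fill_zeros_by_closing(a)

# Holes wider than the closing window can survive; check before trusting it.
holes_left = bool((filled == 0).any())
```

If holes_left comes back True, a larger size is probably needed for that data.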
Alternatively, you could use scikit-image:
>>> from skimage.morphology import closing, square
>>> a = np.arange(1,10, dtype=np.uint8).reshape(3,3)
>>> a[1,1] = 0
>>> a
array([[1, 2, 3],
[4, 0, 6],
[7, 8, 9]], dtype=uint8)
>>> closing(a, square(2))
array([[1, 2, 3],
[4, 4, 6],
[7, 8, 9]], dtype=uint8)
>>> a
array([[1, 2, 3],
[4, 0, 6],
[7, 8, 9]], dtype=uint8)
Pass a as the output array and the closing is done in place:
>>> closing(a, square(2), a)
>>> a
array([[1, 2, 3],
[4, 4, 6],
[7, 8, 9]], dtype=uint8)
Use a larger square (or any shape from skimage.morphology) if you have big gaps of zeros. The disadvantage of this (aside from the dependency) is that it seems to only work for uint8.
Upvotes: 1
Reputation: 8712
nonzero_coordinates = np.argwhere(data != 0)
nonzero_values = np.extract(data, data)
or simply:
nonzero_values = data[data!=0]
I initially missed the obvious boolean-indexing way to get nonzero_values; thanks to @askewchan in the comments for pointing it out.
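For completeness, here is a rough sketch of how these two arrays could feed NearestNDInterpolator to fill the bad values. The sample data and the variable names good, bad_coordinates and filled are assumptions for illustration, not taken from the question.

```python
import numpy as np
from scipy.interpolate import NearestNDInterpolator

# Hypothetical sample data; 0 marks a bad value.
data = np.array([[1.0, 2.0, 3.0],
                 [4.0, 0.0, 6.0],
                 [7.0, 8.0, 0.0]])

good = data != 0
nonzero_coordinates = np.argwhere(good)  # shape (n_good, 2): (row, col) pairs
nonzero_values = data[good]              # same row-major order as argwhere

# Build the interpolator from the good points only.
interpolator = NearestNDInterpolator(nonzero_coordinates, nonzero_values)

# Query it at the bad positions and write the results into a copy.
bad_coordinates = np.argwhere(~good)
filled = data.copy()
filled[~good] = interpolator(bad_coordinates)
```

On a regular grid a bad point often has several good neighbors at the same distance; which of them is picked in that case is up to the underlying nearest-neighbor search, so don't rely on a particular tie-break.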
Upvotes: 4
Reputation: 88238
So, I need to get a subset of the indices and values: those that are "good."
If you've created a "mask" of the bad indices, you can negate that mask with ~ and then find the indices of the good values using np.where. For example:
import numpy as np
# Sample array
Z = np.random.random(size=(5,5))
# Use whatever criteria you have to mark the bad indices
bad_mask = Z<.2
good_mask = ~bad_mask
good_idx = np.where(good_mask)
print(good_mask)
print(good_idx)
Gives, as an example:
[[ True True True True False]
[ True False False True True]
[ True False True True True]
[ True True True True True]
[ True True True True True]]
(array([0, 0, 0, 0, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4]), array([0, 1, 2, 3, 0, 3, 4, 0, 2, 3, 4, 0, 1, 2, 3, 4, 0, 1, 2, 3, 4]))
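To get from that tuple of index arrays to the (number of points) x 2 coordinate array and the matching values asked about in the question, something like the following should work. It is a sketch: the seeded generator and the names good_coordinates and good_values are assumptions added for illustration.

```python
import numpy as np

# Seeded generator so the example is repeatable (an assumption, not in the answer above).
rng = np.random.default_rng(0)
Z = rng.random(size=(5, 5))

bad_mask = Z < .2
good_mask = ~bad_mask
good_idx = np.where(good_mask)  # tuple: (row indices, column indices)

# Stack the two index arrays side by side: one (row, col) pair per good point.
good_coordinates = np.column_stack(good_idx)

# Index with the tuple (or the mask itself) to pull out the matching values.
good_values = Z[good_idx]
```

np.column_stack(np.where(mask)) gives the same array as np.argwhere(mask), so either spelling works.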
Upvotes: 1