Reputation: 386
I have a large 2D ndarray of floats, call it `ar`. It contains some NaNs. I am interested in the immediate neighbors of the NaNs to the right (i.e. along `axis=1`). For example, if I know that point (3, 7) is a NaN, I want to select `ar[3, 8:8+N]`. I then want to repeat this for all locations of the NaNs and `vstack` all the slices thus obtained.
I can locate the NaNs with `np.where` easily enough, and then do a `for` loop over the values. Sadly, that's a bit slow. Is there an efficient way to do the indexing in a vectorised fashion? So I have a list of tuples `(x, y)`, and I want to get, more or less,

result = np.vstack([ar[x, y+1:y+1+N] for x, y in tuples])

just without the looping. Is that possible?
Many thanks in advance.
Upvotes: 2
Views: 71
Reputation: 67447
What you are asking for is ill-defined if a NaN occurs fewer than `N` columns from the right edge, but the following should work:
rows, cols = np.where(np.isnan(ar))
cols = (cols[:, None] + np.arange(1, N+1)).reshape(-1)
# Handle indices out of range by repeating the last column
cols = np.clip(cols, 0, ar.shape[1] - 1)
rows = np.repeat(rows, N)
result = ar[rows, cols].reshape(-1, N)
Making up some fake data:
>>> ar = np.random.rand(25)
>>> ar[np.random.randint(25, size=5)] = np.nan
>>> ar = ar.reshape(5, 5)
>>> N = 2
and running the above code on it yields:
>>> ar
array([[ 0.96556647, nan, 0.02934316, 0.82174232, 0.29293098],
[ 0.34819313, 0.57449136, nan, nan, 0.32791866],
[ 0.14020414, 0.60668458, 0.95613773, 0.09533064, 0.43401037],
[ 0.83888255, 0.34240687, nan, 0.02495232, 0.36234979],
[ 0.21870906, 0.24181006, 0.81447603, 0.24216213, nan]])
>>> result
array([[ 0.02934316, 0.82174232],
[ nan, 0.32791866],
[ 0.32791866, 0.32791866],
[ 0.02495232, 0.36234979],
[ nan, nan]])
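If you would rather have the out-of-range positions come back as NaN instead of repeating the last column, one variation is to pad the array with N columns of NaN on the right before indexing. A sketch, using made-up data in place of your `ar`:

```python
import numpy as np

ar = np.array([[1.0, np.nan, 3.0, 4.0],
               [5.0, 6.0, np.nan, 8.0]])
N = 2

# Pad N NaN columns on the right so windows that run past the edge
# read NaN instead of repeating the last real column.
padded = np.pad(ar, ((0, 0), (0, N)), constant_values=np.nan)

rows, cols = np.where(np.isnan(ar))
cols = (cols[:, None] + np.arange(1, N + 1)).ravel()
rows = np.repeat(rows, N)
result = padded[rows, cols].reshape(-1, N)
```

Everything else is the same as above; only the `np.clip` step is replaced by the padding, so no clamping is needed.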
Upvotes: 1