Reputation: 1134
I have a multidimensional array. Example (in 2D):
x = np.array([[ 1., 1., np.nan, np.nan],
              [ 2., np.nan, 2., np.nan],
              [ np.nan, 3., np.nan, np.nan]])
Is there an easy, efficient way to "compress" / "squeeze" / "push" the nans out of it, along an axis? I mean, so that the output (here: axis=0) would become:
np.array([[ 1., 1., np.nan, np.nan],
          [ 2., 3., 2., np.nan]])
Should also work with more than 2 dimensions.
Upvotes: 1
Views: 1650
Reputation: 53029
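You can use argsort on the mask of non-nan elements; use a stable sort algorithm (like mergesort) to preserve the original order of the non-nan elements. The nans end up first along the axis, so the leading rows that are nan in every column can then be cut off: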
mask = np.isnan(x)
cut = np.min(np.count_nonzero(mask, axis=0))
x[np.argsort(~mask, axis=0, kind='mergesort')[cut:], np.arange(x.shape[1])]
Output:
array([[ 1., 1., nan, nan],
       [ 2., 3., 2., nan]])
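To see why the stable argsort does the job, it can help to look at the intermediate arrays. The following is only an illustrative sketch on the example array from the question; the names order, cols and result are introduced here and are not part of the snippet above.

```python
import numpy as np

x = np.array([[1., 1., np.nan, np.nan],
              [2., np.nan, 2., np.nan],
              [np.nan, 3., np.nan, np.nan]])

mask = np.isnan(x)

# Stable argsort of the "is a number" flags: False (nan) sorts before True,
# so in each column the nan rows come first and the other rows keep their order.
order = np.argsort(~mask, axis=0, kind='mergesort')
# order is:
# [[2 1 0 0]
#  [0 0 2 1]
#  [1 2 1 2]]

# The smallest nan count over all columns is the number of leading rows
# that are nan everywhere after the reordering.
cut = np.min(np.count_nonzero(mask, axis=0))  # 1 for this x

# Pair every row index in order[cut:] with its own column index.
cols = np.arange(x.shape[1])
result = x[order[cut:], cols]
print(result)
```

The column indices in cols broadcast against the (2, 4) block of row indices, so each column is reordered independently.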
ND-version:
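import numpy as np

def nan_bouncer(x, axis=0):
    # Work along axis 0; move the requested axis there if needed.
    if axis != 0:
        x = np.moveaxis(x, axis, 0)
    mask = np.isnan(x)
    # Number of leading slices that are nan everywhere after sorting.
    cut = np.min(np.count_nonzero(mask, axis=0))
    # Open grids for the remaining axes, so they broadcast against the sort order.
    idx = tuple(np.ogrid[tuple(map(slice, x.shape[1:]))])
    res = x[(np.argsort(~mask, axis=0, kind='mergesort')[cut:],) + idx]
    return res if axis == 0 else np.moveaxis(res, 0, axis)

# demo
data = np.random.randint(0, 3, (3, 4, 4)).astype(float)
data /= data / data  # 0/0 gives nan, so zeros become nan (NumPy emits a RuntimeWarning)
print(data)
print(nan_bouncer(data))
print(nan_bouncer(data, 2))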
Sample output:
[[[ nan   1.   2.   1.]
  [  2.  nan  nan   2.]
  [  2.   1.   1.   2.]
  [  1.   1.   2.  nan]]

 [[ nan  nan   2.   1.]
  [  2.   2.  nan   1.]
  [  2.   2.   2.   2.]
  [  2.   2.  nan   1.]]

 [[  1.   1.  nan  nan]
  [  1.   1.   2.   1.]
  [  2.  nan   2.   1.]
  [  1.   1.   1.   2.]]]
[[[ nan  nan  nan  nan]
  [  2.  nan  nan   2.]
  [  2.  nan   1.   2.]
  [  1.   1.  nan  nan]]

 [[ nan   1.   2.   1.]
  [  2.   2.  nan   1.]
  [  2.   1.   2.   2.]
  [  2.   2.   2.   1.]]

 [[  1.   1.   2.   1.]
  [  1.   1.   2.   1.]
  [  2.   2.   2.   1.]
  [  1.   1.   1.   2.]]]
[[[ nan   1.   2.   1.]
  [ nan  nan   2.   2.]
  [  2.   1.   1.   2.]
  [ nan   1.   1.   2.]]

 [[ nan  nan   2.   1.]
  [ nan   2.   2.   1.]
  [  2.   2.   2.   2.]
  [ nan   2.   2.   1.]]

 [[ nan  nan   1.   1.]
  [  1.   1.   2.   1.]
  [ nan   2.   2.   1.]
  [  1.   1.   1.   2.]]]
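On NumPy 1.15 or later, the moveaxis/ogrid indexing can probably be replaced by np.take_along_axis. The following is a sketch under that assumption, not the code from the answer above; the function name nan_bouncer_taa and the helper variable keep are made up for illustration.

```python
import numpy as np

def nan_bouncer_taa(x, axis=0):
    """Push nans to the front along `axis` and trim slices that are nan everywhere.

    Hypothetical variant of nan_bouncer; assumes NumPy >= 1.15.
    """
    mask = np.isnan(x)
    # Smallest nan count over all 1-D lanes along `axis`.
    cut = np.min(np.count_nonzero(mask, axis=axis))
    # Stable sort keeps the original order of the non-nan values.
    order = np.argsort(~mask, axis=axis, kind='stable')
    # Drop the first `cut` positions along `axis`, keep all other axes whole.
    keep = [slice(None)] * x.ndim
    keep[axis] = slice(cut, None)
    return np.take_along_axis(x, order[tuple(keep)], axis=axis)

x = np.array([[1., 1., np.nan, np.nan],
              [2., np.nan, 2., np.nan],
              [np.nan, 3., np.nan, np.nan]])
print(nan_bouncer_taa(x, axis=0))
print(nan_bouncer_taa(x, axis=1))
```

For the 2D example from the question, axis=0 should reproduce the two-row result shown earlier, and axis=1 trims the two leading columns that are nan in every row.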
Upvotes: 4