yusuf
yusuf

Reputation: 3781

Removing columns which has only "nan" values from a NumPy array

I have a NumPy matrix like the one below:

[[182 93 107 ..., nan nan -1]
 [182 93 107 ..., nan nan -1]
 [182 93 110 ..., nan nan -1]
 ..., 
 [188 95 112 ..., nan nan -1]
 [188 97 115 ..., nan nan -1]
 [188 95 112 ..., nan nan -1]]

I want to remove the columns which only involve nan values from the above matrix.

How can I do this? Thanks.

Upvotes: 6

Views: 8794

Answers (1)

xnx
xnx

Reputation: 25478

Assuming your array is of floats now, you can identify all the columns which are NaN and use fancy indexing to retrieve the others:

d
array([[ 182.,   93.,  107.,   nan,   nan,   -1.],
       [ 182.,   93.,  107.,    4.,   nan,   -1.],
       [ 182.,   93.,  110.,   nan,   nan,   -1.],
       [ 188.,   95.,  112.,   nan,   nan,   -1.],
       [ 188.,   97.,  115.,   nan,   nan,   -1.],
       [ 188.,   95.,  112.,   nan,   nan,   -1.]])


d[:,~np.all(np.isnan(d), axis=0)]

array([[ 182.,   93.,  107.,   nan,   -1.],
       [ 182.,   93.,  107.,    4.,   -1.],
       [ 182.,   93.,  110.,   nan,   -1.],
       [ 188.,   95.,  112.,   nan,   -1.],
       [ 188.,   97.,  115.,   nan,   -1.],
       [ 188.,   95.,  112.,   nan,   -1.]])

Upvotes: 9

Related Questions