Reputation: 347
How can I delete every column of a NumPy array in which all the values are
the same?
For example, if I have this matrix:
[0 1 2 3 1]
[0 2 2 1 0]
[0 4 2 3 4]
[0 1 2 3 4]
[0 1 2 4 5]
I want to get a new matrix that looks like this:
[1 3 1]
[2 1 0]
[4 3 4]
[1 3 4]
[1 4 5]
Upvotes: 2
Views: 2678
Reputation: 22245
Assuming
import numpy
a = numpy.array([[0, 1, 2, 3, 1],
                 [0, 2, 2, 1, 0],
                 [0, 4, 2, 3, 4],
                 [0, 1, 2, 3, 4],
                 [0, 1, 2, 4, 5]])
then
b = a == a[0,:]  # compares every row with the first row using broadcasting
# b: array([[ True,  True,  True,  True,  True],
#           [ True, False,  True, False, False],
#           [ True, False,  True,  True, False],
#           [ True,  True,  True,  True, False],
#           [ True,  True,  True, False, False]], dtype=bool)
Using all along the rows (axis=0) acts as a logical AND over the rows of each column (thanks Divakar!):
c = b.all(axis=0)
# c: array([ True, False, True, False, False], dtype=bool)
The negated mask can then be used for boolean indexing on the columns:
a[:, ~c]
Out:
array([[1, 3, 1],
       [2, 1, 0],
       [4, 3, 4],
       [1, 3, 4],
       [1, 4, 5]])
As an ugly one-liner:
a[:, ~(a == a[0,:]).all(0)]
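For reuse, the steps above could be wrapped in a small helper. This is only a sketch: the name drop_constant_columns is hypothetical (it is not a NumPy function), and it assumes a 2-D array with at least one row.

```python
import numpy as np

def drop_constant_columns(a):
    """Return a copy of 2-D array `a` without the columns whose entries are all equal.

    Hypothetical helper for illustration; assumes `a` has at least one row.
    """
    a = np.asarray(a)
    # Broadcasting stretches the first row over all rows for the comparison.
    is_constant = (a == a[0]).all(axis=0)
    # Boolean indexing on the column axis keeps only the non-constant columns.
    return a[:, ~is_constant]

a = np.array([[0, 1, 2, 3, 1],
              [0, 2, 2, 1, 0],
              [0, 4, 2, 3, 4],
              [0, 1, 2, 3, 4],
              [0, 1, 2, 4, 5]])
result = drop_constant_columns(a)
```

Note that NaN never compares equal to itself, so a column consisting only of NaN would not be treated as constant by this comparison.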
Upvotes: 2
Reputation: 215057
You can compare the array with a version of itself shifted by one row. If every pair of adjacent rows is equal within a column, that column contains only one unique value, and it can be removed with boolean indexing:
import numpy as np  # a is the array from the question
a[:, ~np.all(a[1:] == a[:-1], axis=0)]
# array([[1, 3, 1],
#        [2, 1, 0],
#        [4, 3, 4],
#        [1, 3, 4],
#        [1, 4, 5]])
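To make the shifted comparison easier to follow, here is the same expression broken into steps on the array from the question. The intermediate names (pairs_equal, constant, filtered) are only illustrative.

```python
import numpy as np

a = np.array([[0, 1, 2, 3, 1],
              [0, 2, 2, 1, 0],
              [0, 4, 2, 3, 4],
              [0, 1, 2, 3, 4],
              [0, 1, 2, 4, 5]])

# Compare each row with the row above it; the result has one row fewer than a.
pairs_equal = a[1:] == a[:-1]

# A column is constant only if every adjacent pair in it is equal.
constant = np.all(pairs_equal, axis=0)
# constant: array([ True, False,  True, False, False])

# Keep the columns that are not constant.
filtered = a[:, ~constant]
```

On this input the mask matches the one obtained by comparing every row with the first row, so both answers select the same columns.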
Upvotes: 5