Efficient way to make an array unique by column

Question

Assume we have a two-dimensional array like the following:

import numpy as np

array1 = np.array([[1, 4, 3, 64356, 5435, 434],
                   [11, 46, 3, 7356, 585, 74],
                   [51, 406, 3, 769, 5435, 24],
                   [12, 45, 5, 656, 135, 134],
                   [112, 475, 5, 656, 1385, 134],
                   [13, 46, 5, 656, 1385, 19]])

Rows 4 and 5 (zero-indexed) are not unique in terms of their columns 2, 3, and 4, so we want to drop one of them. Is there an efficient way to drop rows of an array so that the remaining rows are unique with respect to a selected set of columns?

Quang Hoang · Accepted Answer

A solution in pure numpy:

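# Index of the first occurrence of each unique combination in columns 2, 3, 4
_, idx = np.unique(array1[:, [2, 3, 4]], axis=0, return_index=True)
# Sort the indices so the kept rows stay in their original order
array1[np.sort(idx)]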

Output:

array([[    1,     4,     3, 64356,  5435,   434],
       [   11,    46,     3,  7356,   585,    74],
       [   51,   406,     3,   769,  5435,    24],
       [   12,    45,     5,   656,   135,   134],
       [  112,   475,     5,   656,  1385,   134]])
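
For reuse, the same idea can be wrapped in a small helper. This is only a sketch of the approach above: the function name `unique_rows_by_columns` and its parameters are made up for illustration and are not part of NumPy. It relies on `return_index=True` giving the index of the first occurrence of each unique row, so the first of any duplicated group is the one that is kept.

```python
import numpy as np


def unique_rows_by_columns(arr, cols):
    """Keep the first row for each distinct combination of values in `cols`.

    Hypothetical helper for illustration; not a NumPy function.
    """
    # With axis=0, np.unique treats each row of the sub-array as one item.
    # return_index=True yields the index of the first occurrence of each one.
    _, first_idx = np.unique(arr[:, cols], axis=0, return_index=True)
    # np.unique orders its results by value, so sort the indices
    # to restore the original row order.
    return arr[np.sort(first_idx)]


array1 = np.array([[1, 4, 3, 64356, 5435, 434],
                   [11, 46, 3, 7356, 585, 74],
                   [51, 406, 3, 769, 5435, 24],
                   [12, 45, 5, 656, 135, 134],
                   [112, 475, 5, 656, 1385, 134],
                   [13, 46, 5, 656, 1385, 19]])

result = unique_rows_by_columns(array1, [2, 3, 4])
print(result.shape)  # (5, 6): the last row duplicates row 4 on columns 2, 3, 4
```

Note that `np.unique` sorts internally, so the cost is roughly that of sorting the selected columns row-wise, which should be acceptable for most array sizes.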
