Alejandro

Reputation: 949

Efficient way to make an array unique by column

Assume we have a two-dimensional array like the following:

array1 = np.array([[1,4,3, 64356,5435,434],
                   [11,46,3, 7356,585,74],
                   [51,406,3, 769,5435,24],
                   [12,45,5, 656,135,134],
                   [112,475,5, 656,1385,134],
                   [13,46,  5, 656,1385,19]])

Rows 4 and 5 (zero-based) are not unique in terms of their columns 2, 3, and 4, so we want to drop one of them. Is there an efficient way to drop rows of an array so that its rows are unique with respect to a selected set of columns?
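To make the duplication concrete, here is a minimal check, assuming zero-based indexing for both rows and columns. The names `cols`, `key`, and `rows_match` are illustrative only and do not come from the original post.

```python
import numpy as np

array1 = np.array([[1, 4, 3, 64356, 5435, 434],
                   [11, 46, 3, 7356, 585, 74],
                   [51, 406, 3, 769, 5435, 24],
                   [12, 45, 5, 656, 135, 134],
                   [112, 475, 5, 656, 1385, 134],
                   [13, 46, 5, 656, 1385, 19]])

cols = [2, 3, 4]       # zero-based columns used as the uniqueness key (assumption)
key = array1[:, cols]  # view each row through the selected columns only

# Rows 4 and 5 differ as full rows but agree on the selected columns.
rows_match = np.array_equal(key[4], key[5])
print(rows_match)
```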

Upvotes: 1

Views: 202

Answers (2)

Aaj Kaal

Reputation: 1274

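Convert to a pandas DataFrame, drop the duplicates, and convert back, as suggested by S.Mohsen. Note that this example deduplicates on columns 2 and 3 only; use subset=[2, 3, 4] to match the columns named in the question.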

Code:

import pandas as pd
import numpy as np

array1 = np.array([[1,4,3, 64356,5435,434],
                   [11,46,3, 7356,585,74],
                   [51,406,3, 769,5435,24],
                   [12,45,5, 656,135,134],
                   [112,475,5, 656,1385,134],
                   [13,46,  5, 656,1385,19]])
                   
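# Wrap the array in a DataFrame; columns are labelled 0..5
df = pd.DataFrame(data=array1)
print(df)

# Keep only the first row for each distinct (column 2, column 3) pair
df.drop_duplicates(subset=[2, 3], inplace=True)
print(df)

# Convert back to a NumPy array
array2 = df.values
print(array2)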

Output:

     0    1  2      3     4    5
0    1    4  3  64356  5435  434
1   11   46  3   7356   585   74
2   51  406  3    769  5435   24
3   12   45  5    656   135  134
4  112  475  5    656  1385  134
5   13   46  5    656  1385   19

    0    1  2      3     4    5
0   1    4  3  64356  5435  434
1  11   46  3   7356   585   74
2  51  406  3    769  5435   24
3  12   45  5    656   135  134

[[    1     4     3 64356  5435   434]
 [   11    46     3  7356   585    74]
 [   51   406     3   769  5435    24]
 [   12    45     5   656   135   134]]

Upvotes: 1

Quang Hoang

Reputation: 150735

A solution in pure numpy:

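# idx holds the index of the first occurrence of each unique (col 2, col 3, col 4) combination
_, idx = np.unique(array1[:, [2, 3, 4]], axis=0, return_index=True)
# Sort the indices so the surviving rows keep their original order
array1[sorted(idx)]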

Output:

array([[    1,     4,     3, 64356,  5435,   434],
       [   11,    46,     3,  7356,   585,    74],
       [   51,   406,     3,   769,  5435,    24],
       [   12,    45,     5,   656,   135,   134],
       [  112,   475,     5,   656,  1385,   134]])
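The same idea can be wrapped in a small reusable function. This is only a sketch: `unique_rows_by_columns` is a hypothetical helper name, not part of NumPy, and it assumes a 2-D array and zero-based column indices. It relies on `np.unique` with `return_index=True` returning the index of the first occurrence of each unique row.

```python
import numpy as np

def unique_rows_by_columns(arr, cols):
    """Keep the first row for each distinct combination of values in `cols`.

    Hypothetical helper for illustration; `arr` is assumed to be 2-D.
    """
    # first_idx[i] is the first row index at which the i-th unique key occurs
    _, first_idx = np.unique(arr[:, cols], axis=0, return_index=True)
    # np.unique orders results by key value, so sort to restore row order
    return arr[np.sort(first_idx)]

array1 = np.array([[1, 4, 3, 64356, 5435, 434],
                   [11, 46, 3, 7356, 585, 74],
                   [51, 406, 3, 769, 5435, 24],
                   [12, 45, 5, 656, 135, 134],
                   [112, 475, 5, 656, 1385, 134],
                   [13, 46, 5, 656, 1385, 19]])

result = unique_rows_by_columns(array1, [2, 3, 4])
print(result)
```

Passing a different column list, for example `[2, 3]`, changes which rows count as duplicates, so the choice of columns should match the intended uniqueness key.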

Upvotes: 2
