Dinc Kirikci

Reputation: 47

Numpy Array Comparison (Python)

I have a question about the NumPy array below.

I have a dataset with 50 rows and 15 columns, and I converted it to a NumPy array like this:

x = x.to_numpy()

I want to compare each row with every other row (excluding itself), then find the number of rows that satisfy the following condition:

there is no other row for which

- both values are smaller, or

- one value is equal and the other one is smaller

Money Weight
10 80
20 70
30 90
25 50
35 10
40 60
50 10

For instance, for row 1: no other row is smaller in both columns; whenever another row is smaller in one column, row 1 is smaller in the other. It satisfies the condition.

For row 3: no other row is smaller in both columns. It is equal to row 6 in the Weight column, but it is smaller in the Money column. It satisfies the condition.

For row 6: no other row is smaller in both columns. It is equal to row 3 in the Weight column, but its Money value is greater. It does not satisfy the condition.

I need the number of rows in the NumPy array that satisfy the condition.

I have tried a bunch of approaches but could not find a proper algorithm. If anyone has an idea, I would appreciate it.

Upvotes: 0

Views: 64

Answers (1)

Dani Mesejo

Reputation: 61910

IIUC, you can do the following:

arr = df.to_numpy()  # df is the original DataFrame (Money, Weight)
mask = (arr <= arr[:, None]).all(2).sum(1) < 2
res = df[mask]
print(res)

Output

   Money  Weight
0     10      80
1     20      70
3     25      50
4     35      10
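
Since the question asks for the number of rows rather than the rows themselves, the mask can simply be summed. Below is a minimal sketch on the sample data from the question; here arr is built directly with np.array instead of df.to_numpy(), which is an assumption made only to keep the snippet self-contained.

```python
import numpy as np

# Sample data copied from the question (columns: Money, Weight)
arr = np.array([
    [10, 80],
    [20, 70],
    [30, 90],
    [25, 50],
    [35, 10],
    [40, 60],
    [50, 10],
])

# True for rows where the only row that is <= in every column is the row itself
mask = (arr <= arr[:, None]).all(2).sum(1) < 2

# Each True counts as 1, so the sum is the number of qualifying rows
count = int(mask.sum())
kept_positions = np.flatnonzero(mask).tolist()

print(count)           # 4
print(kept_positions)  # [0, 1, 3, 4]
```

Note that an exact duplicate of a row counts as "another row that is less than or equal", so with this mask both copies would be excluded.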

Breakdown

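# pairwise comparison between rows (elementwise):
# comparison[i, j, k] is True when arr[j, k] <= arr[i, k]
comparison = (arr <= arr[:, None])

# reduce over columns: True when row j is <= row i in every column
lower_values = comparison.all(2)

# for each row i, count the rows that are <= it in every column
# (the count always includes row i itself)
total_lower = lower_values.sum(1)

# keep only the rows whose count is 1, i.e. matched by nothing but themselves
mask = total_lower <= 1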

# filter the original DataFrame
res = df[mask]

print(res)
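
To sanity-check the broadcasting steps, they can be compared against a plain double loop. The sketch below is only a cross-check under the same reading of the condition; the helper name undominated_loop is hypothetical, and arr is again built with np.array from the question's sample data.

```python
import numpy as np

# Sample data copied from the question (columns: Money, Weight)
arr = np.array([[10, 80], [20, 70], [30, 90], [25, 50],
                [35, 10], [40, 60], [50, 10]])


def undominated_loop(a):
    """Hypothetical helper: True where no *other* row is <= in every column."""
    n = len(a)
    keep = np.ones(n, dtype=bool)
    for i in range(n):
        for j in range(n):
            # skip the row itself; any other row that is <= everywhere disqualifies row i
            if i != j and np.all(a[j] <= a[i]):
                keep[i] = False
                break
    return keep


# Vectorized version from the breakdown
comparison = arr <= arr[:, None]          # shape (n_rows, n_rows, n_cols)
mask = comparison.all(2).sum(1) <= 1

loop_mask = undominated_loop(arr)

print(comparison.shape)                   # (7, 7, 2)
print(np.array_equal(mask, loop_mask))    # True
```

The intermediate comparison array holds n_rows * n_rows * n_cols booleans, which is tiny for the 50 x 15 dataset in the question (37,500 values) but grows quadratically with the number of rows.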

Upvotes: 1
