Reputation: 47
I have a question about the NumPy array below.
I have a dataset with 50 rows and 15 columns, which I converted to a NumPy array like this:
x=x.to_numpy()
I want to compare each row with every other row (excluding itself) and then find the number of rows that satisfy the following condition:
there is no other row for which
- both values are smaller, or
- one value is equal and the other one is smaller
Money Weight
10 80
20 70
30 90
25 50
35 10
40 60
50 10
For instance, for row 1: there is no other row that is smaller in both columns; if another row is smaller in one column, row 1 is smaller in the other. It satisfies the condition.
For row 3: there is no other row that is smaller in both columns; it is equal to row 6 in the Weight column, but it is smaller in the Money column. It satisfies the condition.
For row 6: there is no other row that is smaller in both columns; it is equal to row 3 in the Weight column, but its Money value is greater. It does not satisfy the condition.
I need the number of rows in the NumPy array that satisfy the condition.
I have tried a bunch of approaches but could not find a proper algorithm. If anyone has an idea, I would appreciate it.
Upvotes: 0
Views: 64
Reputation: 61910
IIUC, you can do the following:
arr = df.to_numpy()  # assumes df is the original DataFrame
mask = (arr <= arr[:, None]).all(2).sum(1) < 2
res = df[mask]
print(res)
Output
Money Weight
0 10 80
1 20 70
3 25 50
4 35 10
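Since the question asks for a count rather than the rows themselves, summing the mask should give that number directly. Below is a minimal sketch that rebuilds the sample data from the question as a plain NumPy array (the name `count` is just for illustration):

```python
import numpy as np

# Sample data copied from the question (columns: Money, Weight)
arr = np.array([
    [10, 80],
    [20, 70],
    [30, 90],
    [25, 50],
    [35, 10],
    [40, 60],
    [50, 10],
])

# Same mask as above: True where no other row is <= this row in every column
mask = (arr <= arr[:, None]).all(2).sum(1) < 2

# Each True counts as 1, so the sum is the number of qualifying rows
count = int(mask.sum())
print(count)  # 4
```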
Breakdown
# pairwise comparison between rows (elementwise): comparison[i, j, k] is arr[j, k] <= arr[i, k]
comparison = (arr <= arr[:, None])
# reduce over columns: True where row j is <= row i in every column
lower_values = comparison.all(2)
# for each row, count the rows that are <= it in every column (itself included)
total_lower = lower_values.sum(1)
# keep only rows whose count is 1, i.e. only the row itself
mask = total_lower <= 1
# filter the original DataFrame
res = df[mask]
print(res)
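One caveat, as far as I can tell: because the comparison uses `<=`, two identical rows count each other, so both get dropped. If exact duplicates should not exclude one another, a variant that also requires a strictly smaller value in at least one column may be closer to the condition in the question. This is only a sketch; `count_non_dominated` is a hypothetical helper name, not something from NumPy:

```python
import numpy as np

def count_non_dominated(arr):
    """Count rows that no other row dominates (hypothetical helper)."""
    arr = np.asarray(arr)
    # le[i, j]: row j is <= row i in every column
    le = (arr[None, :, :] <= arr[:, None, :]).all(axis=2)
    # lt[i, j]: row j is strictly < row i in at least one column
    lt = (arr[None, :, :] < arr[:, None, :]).any(axis=2)
    # row i is dominated if some row j satisfies both conditions;
    # a row never dominates itself because lt[i, i] is always False
    dominated = (le & lt).any(axis=1)
    return int((~dominated).sum())

sample = np.array([[10, 80], [20, 70], [30, 90], [25, 50],
                   [35, 10], [40, 60], [50, 10]])
print(count_non_dominated(sample))    # 4

# Append an exact copy of the first row: both copies are kept
with_dup = np.vstack([sample, [[10, 80]]])
print(count_non_dominated(with_dup))  # 5
```

Both versions build an n x n x k boolean array, which is negligible for 50 rows and 15 columns but grows quadratically with the number of rows.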
Upvotes: 1