Alfonso_MA

Reputation: 555

Pandas dataframe: comparing every cell with all previous values

For every cell in a pandas dataframe, I would like to calculate how many previous cells in the same column are less than or equal to that cell's value.

For example:

I would like to convert this dataframe:

10  100         
20  300         
30  50          
40  25          
50  30          
40  70          
30  100         
60  150         

Into this other dataframe:

0   0
1   1
2   0
3   0
4   1
4   3
3   5
7   6

How can I get it? My dataframe is huge, so performance is a plus.

Upvotes: 0

Views: 217

Answers (2)

user8060120

Reputation:

You can use expanding with apply:

import pandas as pd

df = pd.DataFrame(
    [100, 300, 50, 25, 30, 70, 100, 150],
    columns=['num']
)
# x is a Series holding the current value and all previous values;
# the last element, x.iloc[-1], is the current value
df.num.expanding().apply(
    lambda x: (x.iloc[:-1] <= x.iloc[-1]).sum(),
    raw=False
).astype(int)

The result is:

0    0
1    1
2    0
3    0
4    1
5    3
6    5
7    6
Name: num, dtype: int64
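
The same idea can probably be applied to the whole two-column dataframe from the question in one call, since expanding().apply works column by column. This is only a sketch: the column names "a" and "b" and the helper name n_previous_le are made up for illustration, and raw=True is used so the function receives a plain NumPy array instead of a Series. It still compares every value with all earlier ones, so the running time grows quadratically with the number of rows.

```python
import pandas as pd

# The question's data; the column names "a" and "b" are assumptions.
df = pd.DataFrame({
    "a": [10, 20, 30, 40, 50, 40, 30, 60],
    "b": [100, 300, 50, 25, 30, 70, 100, 150],
})


def n_previous_le(window):
    # With raw=True, window is a NumPy array holding every value of the
    # column up to and including the current one (the last element).
    return (window[:-1] <= window[-1]).sum()


# apply() runs the function on each column's expanding window.
out = df.expanding().apply(n_previous_le, raw=True).astype(int)
print(out)
```

Passing raw=True avoids building a Series for every window, which usually helps on large inputs.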

Upvotes: 1

Quang Hoang

Reputation: 150745

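If your data is not too long, you can try broadcasting (it builds a boolean array of shape rows x rows x columns, so memory grows quadratically with the number of rows):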

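import numpy as np

# df is the two-column dataframe from the question
a = df.to_numpy()
n = len(df)

# mask[i, j] is True when row i comes before row j
mask = np.triu(np.ones((n, n), dtype=bool), 1)

# for each row j and each column, count the earlier rows i with a[i] <= a[j]
out = (mask[..., None] * (a[:, None] <= a[None, ...])).sum(0)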

Output:

array([[0, 0],
       [1, 1],
       [2, 0],
       [3, 0],
       [4, 1],
       [4, 3],
       [3, 5],
       [7, 6]])
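
If the quadratic memory of the broadcasting approach is a problem, one alternative (a different technique, not the broadcasting above) is to keep the earlier values of a column in a sorted list and look up each new value with a binary search from the standard-library bisect module. The sketch below is pure Python; the function name count_previous_le is hypothetical, and it assumes the column contains no NaN values.

```python
from bisect import bisect_right, insort


def count_previous_le(values):
    """For each item, count how many earlier items are <= it."""
    seen = []    # earlier values, kept in sorted order
    counts = []
    for v in values:
        # bisect_right gives the number of earlier values that are <= v
        counts.append(bisect_right(seen, v))
        # insert v so that the list stays sorted for the next lookup
        insort(seen, v)
    return counts


print(count_previous_le([100, 300, 50, 25, 30, 70, 100, 150]))
# prints [0, 1, 0, 0, 1, 3, 5, 6]
```

To use it on a dataframe, something like pd.DataFrame({c: count_previous_le(df[c]) for c in df.columns}, index=df.index) should work. Each lookup is a binary search; the insertion still shifts list elements, so the worst case stays quadratic in time, but memory is linear in the number of rows.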

Upvotes: 0
