Lele Canfora
Lele Canfora

Reputation: 77

How to calculate for each row the order of column value by row?

I have a dataframe that looks like this:

- date_x date_y date_z
0 2021-02-19 12:26:48+00:00 2021-02-19 16:15:21.510000+00:00 NaT
1 2021-02-19 16:16:54+00:00 2021-02-10 17:44:01+00:00 2021-02-11 10:10:01+00:00
1 2021-02-21 16:17:54+00:00 2021-02-20 17:44:01+00:00 2021-02-15 10:10:01+00:00

and I want to calculate for each row the order of each column by row:

- date_x date_y date_z
0 0 1 NaN
1 2 0 1
1 2 1 0

To then get the average order of each column for the whole dataframe:

- date_x date_y date_z
0 1,33333333333333 0,666666666666667 0.5

I tried using pandas Series.argsort() but it's the same as Series.sort_values() but instead returning indexes not values. I then thought of iterating each row and looking up the index of the list provided by sort_values but I'm sure there MUST be a faster and cleaner solution?

Thanks

Upvotes: 0

Views: 50

Answers (1)

jezrael
jezrael

Reputation: 863166

You can use DataFrame.rank, subtract 1, get mean and convert Series to one row DataFrame by Series.to_frame and transpose:

df = df.rank(axis=1).sub(1).mean().to_frame().T
print (df)
     date_x    date_y  date_z
0  1.333333  0.666667     0.5

Upvotes: 1

Related Questions