Whitebeard13
Whitebeard13

Reputation: 431

Python pandas: sort by multiple columns and add a column with the order numbers

Assume the following dataframe:

ID          Date1           Date2       Col1      Col2       DateDiff 
22000118    2019-11-06      2019-11-15  0.562231  0.470641   9 days
22000118    2019-11-06      2019-12-20  0.375000  0.319872   44 days
30          2019-11-06      2019-11-15  0.916047  0.730626   9 days
30          2019-11-06      2019-12-20  0.519936  0.423861   44 days
22000118    2020-11-05      2025-12-19  0.316772  0.301951   1870 days
30          2020-11-05      2026-12-18  0.256964  0.234729   2234 days
30          2020-11-05      2027-12-17  0.250835  0.230236   2598 days
22000118    2019-11-06      2020-01-17  0.330995  0.287567   72 days
22000118    2020-11-05      2026-12-18  0.310234  0.296930   2234 days
22000118    2020-11-05      2027-12-17  0.305502  0.293349   2598 days
30          2019-11-06      2020-01-17  0.443920  0.366206   72 days
30          2020-11-05      2025-12-19  0.264916  0.240628   1870 days

I would like to sort it by the columns ID, Date1 and DateDiff (where Date1 is datetime64 type and DateDelta is timedelta64 type) and add a column with the order numbers just like the following example:

ID          Date1           Date2       Col1      Col2       DateDiff    Order
30          2019-11-06      2019-11-15  0.916047  0.730626   9 days      1
30          2019-11-06      2019-12-20  0.519936  0.423861   44 days     2
30          2019-11-06      2020-01-17  0.443920  0.366206   72 days     3
30          2020-11-05      2025-12-19  0.264916  0.240628   1870 days   1
30          2020-11-05      2026-12-18  0.256964  0.234729   2234 days   2
30          2020-11-05      2027-12-17  0.250835  0.230236   2598 days   3
22000118    2019-11-06      2019-11-15  0.562231  0.470641   9 days      1
22000118    2019-11-06      2019-12-20  0.375000  0.319872   44 days     2
22000118    2019-11-06      2020-01-17  0.330995  0.287567   72 days     3
22000118    2020-11-05      2025-12-19  0.316772  0.301951   1870 days   1
22000118    2020-11-05      2026-12-18  0.310234  0.296930   2234 days   2
22000118    2020-11-05      2027-12-17  0.305502  0.293349   2598 days   3

I am already aware of how to sort it by using sort_values, but I don't know how to add the last column with the order numbers.

Upvotes: 0

Views: 49

Answers (1)

BENY
BENY

Reputation: 323226

After you sort by sort_values

df['Order'] = df.groupby(['ID','Date1']).cumcount()+1

Upvotes: 1

Related Questions