Reputation: 2219
I have a dataframe df that is grouped by ID. For each group, there is one row with a flag that identifies it as the first instance: 'first' == 1.
I ultimately want to sort each group by admit date. However, I need the row with 'first' == 1 to be the first row of its group regardless of its date, with the remaining rows sorted by admit date.
Sample df:
ID admit discharge discharge_location first
20 3-4-2018 3-6-2018 Home 1
20 2-2-2018 2-6-2018 Home 0
20 2-5-2018 2-23-2018 Home 0
30 1-2-2018 2-3-2018 Home 0
30 1-15-2018 1-18-2018 Home 1
30 1-20-2018 1-24-2018 Home 0
Expected df:
ID admit discharge discharge_location first
20 3-4-2018 3-6-2018 Home 1
20 2-2-2018 2-6-2018 Home 0
20 2-5-2018 2-23-2018 Home 0
30 1-15-2018 1-18-2018 Home 1
30 1-2-2018 2-3-2018 Home 0
30 1-20-2018 1-24-2018 Home 0
My current approach sorts by ID and admit date, but it does not put the row with 'first' == 1 at the top of its group:
df.sort_values(by=['ID','admit'], inplace=True)
This has been stumping me all day.
Upvotes: 2
Views: 1107
Reputation: 19395
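Here it is: sort by ID, then by first in descending order so the flagged row lands at the top of each group, then by admit in ascending order.
df.sort_values(by=['ID', 'first', 'admit'],
               ascending=[True, False, True],
               inplace=True)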
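For anyone who wants to try this end to end, here is a minimal sketch that rebuilds the sample df from the question and applies the same sort. One assumption that is not in the original post: the admit and discharge values are treated as month-day-year strings and parsed with pd.to_datetime first, since sorting the raw strings would compare them character by character rather than chronologically. The name result is just for illustration.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "ID": [20, 20, 20, 30, 30, 30],
    "admit": ["3-4-2018", "2-2-2018", "2-5-2018",
              "1-2-2018", "1-15-2018", "1-20-2018"],
    "discharge": ["3-6-2018", "2-6-2018", "2-23-2018",
                  "2-3-2018", "1-18-2018", "1-24-2018"],
    "discharge_location": ["Home"] * 6,
    "first": [1, 0, 0, 0, 1, 0],
})

# Assumption: dates are month-day-year strings. Parse them so that
# sorting is chronological instead of lexicographic.
df["admit"] = pd.to_datetime(df["admit"], format="%m-%d-%Y")
df["discharge"] = pd.to_datetime(df["discharge"], format="%m-%d-%Y")

# ID ascending, first descending (flag row on top), admit ascending.
result = df.sort_values(
    by=["ID", "first", "admit"],
    ascending=[True, False, True],
).reset_index(drop=True)

print(result)
```

This returns a new frame instead of using inplace=True, which leaves the original df untouched; either style should give the same row order.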
Upvotes: 5