Reputation: 63
The dataframe that I have is:
import pandas as pd

df = pd.DataFrame(data={'Question':['Q2','Q2','Q1','Q1','Q1','Q3','Q3','Q3'],
'Answer':['Yes','No','$1 to $49','$100 to $200','$50 to $100','More than 5000','Less than 5000',"Don't know"]})
I would like to sort the dataframe by the columns Question and Answer. I have created a custom dictionary to use when sorting by Answer, so that the categorical values are sorted in the intended order.
answer_sort_order = {'$1 to $49': 0, '$50 to $100': 1, '$50 to $99': 2, '$100 to $200': 3, 'More than 5000': 4, 'Less than 5000': 5, "Don't Know": 6}
How can I use this to get the dataframe like below?
If needed, I could also restrict the answer_sort_order dictionary to records in which Question is Q1 or Q3.
Upvotes: 0
Views: 359
Reputation: 260725
You can use the key parameter of sort_values:
out = df.sort_values('Answer', key=pd.Series(answer_sort_order).reindex)
or:
out = df.sort_values('Answer', key=lambda x: x.map(answer_sort_order))
output:
  Question          Answer
2       Q1       $1 to $49
4       Q1     $50 to $100
3       Q1    $100 to $200
5       Q3  More than 5000
6       Q3  Less than 5000
0       Q2             Yes
1       Q2              No
7       Q3      Don't know
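
The question also asks to sort by Question as well as Answer. Here is a minimal sketch of one way to do that, assuming pandas 1.1 or later (the version in which sort_values gained the key parameter). The key function is called once per column listed in by, so it can map only the Answer column and leave Question untouched. The helper name sort_key is hypothetical; df and answer_sort_order are copied from the question.

```python
import pandas as pd

# Data and sort order copied from the question.
df = pd.DataFrame({
    "Question": ["Q2", "Q2", "Q1", "Q1", "Q1", "Q3", "Q3", "Q3"],
    "Answer": ["Yes", "No", "$1 to $49", "$100 to $200", "$50 to $100",
               "More than 5000", "Less than 5000", "Don't know"],
})

answer_sort_order = {
    "$1 to $49": 0, "$50 to $100": 1, "$50 to $99": 2, "$100 to $200": 3,
    "More than 5000": 4, "Less than 5000": 5, "Don't Know": 6,
}


def sort_key(col):
    """Hypothetical helper: sort_values passes each `by` column as a Series."""
    if col.name == "Answer":
        # Answers missing from the dictionary become NaN.
        return col.map(answer_sort_order)
    # Question is sorted by its own values.
    return col


out = df.sort_values(["Question", "Answer"], key=sort_key)
print(out)
```

Answers that are missing from the dictionary map to NaN. Here that is Yes, No, and Don't know, whose capitalisation differs from the dictionary key Don't Know. sort_values places NaN last by default (na_position='last'), so those rows come after the mapped answers within each Question group.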
Upvotes: 1