Reputation: 105
The data looks like this:
0 Thursday
1 Thursday
2 Thursday
3 Thursday
etc, etc
My code:
import pandas as pd
data_file = pd.read_csv('./data/Chicago-2016-Summary.csv')
days = data_file['day_of_week']
order = ["Monday","Tuesday","Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]
sorted(days, key=lambda x: order.index(x[0]))
print(days)
This results in error:
ValueError: 'T' is not in list
I tried to sort the days and got this error, but I have no idea what it means.
I just want to sort the data Monday-Sunday so I can do some visualizations. Any suggestions?
Upvotes: 2
Views: 441
Reputation: 40878
You can use pandas' Categorical data type for this:
order = ["Monday","Tuesday","Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]
data_file['day_of_week'] = pd.Categorical(data_file['day_of_week'], categories=order, ordered=True)
data_file.sort_values(by='day_of_week', inplace=True)
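To try this without the CSV, here is a minimal self-contained sketch of the same approach. The small frame is made-up sample data standing in for the real file, and sorted_frame and sorted_days are names chosen for illustration only:

```python
import pandas as pd

# Hypothetical sample data standing in for the CSV's 'day_of_week' column.
data_file = pd.DataFrame(
    {"day_of_week": ["Thursday", "Monday", "Sunday", "Tuesday", "Monday"]}
)

order = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]

# An ordered Categorical sorts by position in `order` rather than alphabetically.
data_file["day_of_week"] = pd.Categorical(
    data_file["day_of_week"], categories=order, ordered=True
)

# sort_values returns a new, sorted frame when inplace is not set.
sorted_frame = data_file.sort_values(by="day_of_week")
sorted_days = sorted_frame["day_of_week"].tolist()
print(sorted_days)
```

Any value that is not listed in order becomes NaN when the column is converted, so it is worth checking the spelling and capitalization of the day names first.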
In your example, be aware that when you write days = data_file['day_of_week'], days may be a view of that column (Series) within your data_file frame rather than an independent copy. If you want to change it separately, use days = data_file['day_of_week'].copy(). Or, just work within the DataFrame as is done above.
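As for the error itself: x is already a whole day name, so x[0] is only its first character ('T' for 'Thursday'), and order.index('T') fails because no entry in order equals 'T'. Also note that sorted() returns a new list and leaves days unchanged, so the result has to be assigned to something. A rough sketch in plain Python, using a short made-up list in place of the real column:

```python
order = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]

# Hypothetical sample values standing in for data_file['day_of_week'].
days = ["Thursday", "Monday", "Sunday", "Tuesday"]

# Reproduce the problem: x[0] is a single character, which is never in `order`.
raised_value_error = False
try:
    sorted(days, key=lambda x: order.index(x[0]))
except ValueError:
    raised_value_error = True

# Look up the whole string instead, and keep the list that sorted() returns.
sorted_days = sorted(days, key=order.index)
print(sorted_days)
```

This works for a quick check, but the Categorical approach keeps the ordering attached to the column, which is more convenient once you start plotting.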
Upvotes: 3