Reputation: 37
I am plotting my dataset (mortality in England and Wales by region), and the dates on the x-axis keep sorting alphabetically: Apr 06, Apr 07, ..., Feb 06, Feb 07, ..., Sep-13, Sep-14.
I want them in chronological order, as they are in my dataset. Is there any way to turn off the forced sorting? I am using matplotlib and seaborn for this plot.
Also, if anyone knows a way to write this without repeating the same line 13 times, I would be happy to hear it.
My code is as follows:
plt.figure(figsize=(48,12))
sns.lineplot(data=Regional,x='Date',y='England and Wales')
sns.lineplot(data=Regional,x='Date',y='England')
sns.lineplot(data=Regional,x='Date',y='North East')
sns.lineplot(data=Regional,x='Date',y='North West')
sns.lineplot(data=Regional,x='Date',y='Yorkshire and the Humber')
sns.lineplot(data=Regional,x='Date',y='East Midlands')
sns.lineplot(data=Regional,x='Date',y='West Midlands')
sns.lineplot(data=Regional,x='Date',y='East of England')
sns.lineplot(data=Regional,x='Date',y='Greater London')
sns.lineplot(data=Regional,x='Date',y='South East')
sns.lineplot(data=Regional,x='Date',y='South West')
sns.lineplot(data=Regional,x='Date',y='Wales')
sns.lineplot(data=Regional,x='Date',y='Non Residents')
plt.legend(['England and Wales','England','North East','North West','Yorkshire and the Humber','East Midlands','West Midlands','East of England','Greater London','South East','South West','Wales','Non Residents'])
Upvotes: 2
Views: 888
Reputation: 16683
Instead of melting, you can also create a pd.MultiIndex to plot automatically as desired with matplotlib:
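import matplotlib.pyplot as plt
import pandas as pd
# Assumes the dates look like 'Jul-06'; adjust the format string if yours differ
Regional['Date'] = pd.to_datetime(Regional['Date'], format='%b-%y')
Regional = Regional.set_index('Date')
Regional.columns = pd.MultiIndex.from_tuples([('Region', col) for col in Regional.columns])
fig, ax = plt.subplots(figsize=(48, 12))  # axes that the lines are drawn on
Regional.plot(ax=ax, title='Daily Mortality Rate by Region', ylabel='Mortality')
plt.legend(title='Regions', labels=[col[1] for col in Regional.columns])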
The seaborn way (see the other answer) is a bit cleaner, but this is just a matplotlib solution.
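To make the column relabeling concrete, here is a minimal sketch of what the MultiIndex step does to the columns. The two-region frame and its values are made up for illustration, and the 'Jul-06' date format is an assumption about your data.

```python
import pandas as pd

# Hypothetical two-region frame; the numbers are invented for illustration only.
regional = pd.DataFrame(
    {"England": [100, 110], "Wales": [10, 12]},
    index=pd.to_datetime(["Jul-06", "Aug-06"], format="%b-%y"),
)

# Wrap every column name in a ('Region', name) tuple, giving two column levels.
regional.columns = pd.MultiIndex.from_tuples(
    [("Region", col) for col in regional.columns]
)

# Iterating a MultiIndex yields tuples, so col[1] recovers the plain region name.
labels = [col[1] for col in regional.columns]

print(regional.columns.nlevels)  # 2
print(labels)                    # ['England', 'Wales']
```

The `labels` list is what gets passed to `plt.legend` so the legend shows the region names rather than the full tuples.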
Upvotes: 1
Reputation: 10590
As mentioned, using pd.melt and a datetime format will likely solve your issues. You can use pd.to_datetime to convert your dates to datetime format. Assuming your strings are in 'Jul-06' format, you can specify the format as '%b-%y'. Otherwise, you can check this table for the correct format specifier.
pd.melt can reshape your dataframe so that the whole plot takes a single line of code. Assuming your dataframe contains only columns for the date and the regions, you can use the following code to pull everything together:
Regional['Date'] = pd.to_datetime(Regional['Date'], format='%b-%y')
Regional = pd.melt(Regional, id_vars=['Date'], var_name='Region', value_name='Mortality')
sns.lineplot(data=Regional, x='Date', y='Mortality', hue='Region')
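If it helps to see what the two steps do before plotting, here is a small sketch on made-up data. The column names mirror the question, but the values and the 'Jul-06' date format are assumptions. It also shows why the string dates sorted the way they did.

```python
import pandas as pd

# Hypothetical sample in wide format; the values are invented for illustration.
wide = pd.DataFrame({
    "Date": ["Jul-06", "Aug-06", "Jan-07", "Apr-07"],
    "England": [100, 110, 130, 105],
    "Wales": [10, 12, 15, 11],
})

# As plain strings the dates sort alphabetically, not chronologically.
alphabetical = sorted(wide["Date"])

# '%b' is the abbreviated month name and '%y' the two-digit year.
wide["Date"] = pd.to_datetime(wide["Date"], format="%b-%y")

# Reshape to one row per (Date, Region) pair instead of one column per region.
long_df = pd.melt(wide, id_vars=["Date"], var_name="Region", value_name="Mortality")

print(alphabetical)            # ['Apr-07', 'Aug-06', 'Jan-07', 'Jul-06']
print(long_df.shape)           # (8, 3)
print(list(long_df.columns))   # ['Date', 'Region', 'Mortality']
```

Once `Date` is a real datetime column, the x-axis is ordered by time, and `hue='Region'` draws one line per region from the long-format frame.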
Upvotes: 2