Hariboharry

Reputation: 37

Python plot forced sorting dates alphabetically instead of chronologically

I am plotting my dataset (mortality in England and Wales by region), and the dates on the x-axis keep sorting alphabetically: Apr 06, Apr 07, ..., Feb 06, Feb 07, ..., Sep-13, Sep-14.

I want them in chronological order, as they are in my dataset. Is there any way to turn off the forced sorting? I am using matplotlib and seaborn for this plot.

Also, if anyone knows a way to write this without repeating the same line 13 times, I would be happy to hear it.

My code is as follows:

plt.figure(figsize=(48,12))

sns.lineplot(data=Regional,x='Date',y='England and Wales')
sns.lineplot(data=Regional,x='Date',y='England')
sns.lineplot(data=Regional,x='Date',y='North East')
sns.lineplot(data=Regional,x='Date',y='North West')
sns.lineplot(data=Regional,x='Date',y='Yorkshire and the Humber')
sns.lineplot(data=Regional,x='Date',y='East Midlands')
sns.lineplot(data=Regional,x='Date',y='West Midlands')
sns.lineplot(data=Regional,x='Date',y='East of England')
sns.lineplot(data=Regional,x='Date',y='Greater London')
sns.lineplot(data=Regional,x='Date',y='South East')
sns.lineplot(data=Regional,x='Date',y='South West')
sns.lineplot(data=Regional,x='Date',y='Wales')
sns.lineplot(data=Regional,x='Date',y='Non Residents')

plt.legend(['England and Wales','England','North East','North West','Yorkshire and the Humber','East Midlands','West Midlands','East of England','Greater London','South East','South West','Wales','Non Residents'])


Upvotes: 2

Views: 888

Answers (2)

David Erickson

Reputation: 16683

Instead of melting, you can also create a pd.MultiIndex on the columns and plot directly with pandas and matplotlib:

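import pandas as pd
import matplotlib.pyplot as plt

Regional['Date'] = pd.to_datetime(Regional['Date'])  # pass format='%b-%y' if the strings look like 'Jul-06'
Regional = Regional.set_index('Date')
Regional.columns = pd.MultiIndex.from_tuples([('Region', col) for col in Regional.columns])
fig, ax = plt.subplots(figsize=(48, 12))  # create the axes that plot() draws on
Regional.plot(ax=ax, title='Daily Mortality Rate by Region', ylabel='Mortality')
plt.legend(title='Regions', labels=[col[1] for col in Regional.columns])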

The seaborn approach in the other answer is a bit cleaner; this one uses only pandas and matplotlib.
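To see what the MultiIndex step does to the column labels, here is a minimal sketch on a made-up two-region frame (the name `wide` and the values are hypothetical, not from the question's data). Each column becomes a ('Region', name) tuple, which is why the legend labels are taken from the second element:

```python
import pandas as pd

# Hypothetical sample: two regions, two months, made-up values.
wide = pd.DataFrame(
    {'England': [10, 12], 'Wales': [3, 4]},
    index=pd.to_datetime(['Jan-06', 'Feb-06'], format='%b-%y'),
)

# Wrap every column name in a ('Region', name) tuple.
wide.columns = pd.MultiIndex.from_tuples([('Region', col) for col in wide.columns])

# The second element of each tuple is the plain region name used for the legend.
legend_labels = [col[1] for col in wide.columns]
```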

Upvotes: 1

busybear

Reputation: 10590

As mentioned, using pd.melt and converting the dates to datetime will likely solve both issues. You can use pd.to_datetime to convert your date strings to datetime values. Assuming your strings are in 'Jul-06' format, specify format='%b-%y'. Otherwise, check the table of strftime format codes in the Python documentation for the correct specifier.
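As a small illustration of why the conversion matters, the sketch below compares sorting the raw strings with sorting the parsed dates. The labels are hypothetical samples in the assumed 'Jul-06' style, not the question's actual data:

```python
import pandas as pd

# Hypothetical sample labels in the assumed 'Jul-06' style.
labels = pd.Series(['Feb-07', 'Apr-06', 'Sep-13', 'Feb-06'])

# Plain strings sort alphabetically, which is what the plot was doing.
alphabetical = sorted(labels)

# Parsed with an explicit format, the values are real timestamps
# and therefore sort (and plot) chronologically.
dates = pd.to_datetime(labels, format='%b-%y')
chronological = dates.sort_values().dt.strftime('%Y-%m').tolist()
```

If the real strings use a space instead of a hyphen ('Apr 06'), the format string would need to be '%b %y' instead.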

pd.melt can reshape your dataframe into long format so that the whole plot takes a single line of code. Assuming your dataframe contains only the date column and the region columns, the following pulls everything together:

Regional['Date'] = pd.to_datetime(Regional['Date'], format='%b-%y')
Regional = pd.melt(Regional, id_vars=['Date'], var_name='Region', value_name='Mortality')
sns.lineplot(data=Regional, x='Date', y='Mortality', hue='Region')
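For reference, here is roughly what the melted frame looks like on a tiny made-up example (the names `wide` and `long_df` and all values are hypothetical). Each region column is stacked into rows, so seaborn can split the lines by the 'Region' column through hue:

```python
import pandas as pd

# Hypothetical two-region sample; the values are made up for illustration.
wide = pd.DataFrame({
    'Date': ['Jan-06', 'Feb-06'],
    'England': [10, 12],
    'Wales': [3, 4],
})
wide['Date'] = pd.to_datetime(wide['Date'], format='%b-%y')

# One row per (Date, Region) pair instead of one column per region.
long_df = pd.melt(wide, id_vars=['Date'], var_name='Region', value_name='Mortality')
print(long_df)
```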

Upvotes: 2
