Reputation: 177
I'm loading these packages:
import pandas as pd
import numpy
import matplotlib.pyplot as plt
import seaborn as sns
import matplotlib
import matplotlib.dates as mdates
sns.set()
%matplotlib inline
And I have a dataframe df which looks like this:
df['element_date'] = pd.to_datetime(df['element_date'])
df['mdate'] = [mdates.date2num(d) for d in df['element_date']]
df.head()
     id    Tier element        element_date          mdate
5228039  Tier B       4 2018-05-28 10:59:00  736842.457639
5232263  Tier B       3 2018-05-28 10:59:00  736842.457639
5245478  Tier B      EA 2018-05-27 13:58:00  736841.581944
4975552  Tier B       2 2018-05-30 21:01:00  736844.875694
4975563  Tier A       2 2018-05-30 21:01:00  736844.875694
I'm trying to set the x axis of a count-plot to month and day only, and I'm getting an error message. This is the code I'm running (I've removed the naming labels to save space):
fig, ax = plt.subplots(figsize=(15,10))
fig = sns.countplot(x="mdate", hue="element", data=df)
ax.xaxis.set_major_formatter(mdates.DateFormatter('%m-%d'))
plt.show(fig)
I'm getting this error: "DateFormatter found a value of x=0, which is an illegal date. This usually occurs because you have not informed the axis that it is plotting dates, e.g., with ax.xaxis_date()"
Now, of course I've tried adding ax.xaxis_date(), to no avail. I also have no x values equal to 0: I've dropped NAs and run value counts on mdate, and there is no 0 to be found.
I've looked at a bunch of different answers here and can't seem to get to a solution. I've tried using element_date as my datetime value, as well as using matplotlib dates via mdate.
Any thoughts would be much appreciated. Essentially, I'm just trying to have my x-axis be an ordered series of dates over two months, with elements being counted for each date.
Thanks!
Upvotes: 2
Views: 5689
Reputation: 107707
Buried in a pandas GitHub issues page, user @pawaller found a workaround using plt.FixedFormatter, where you format the datetime dataframe column as strings:
ax.xaxis.set_major_formatter(plt.FixedFormatter(df['element_date'].dt.strftime("%m-%d")))
However, this does not work on its own, because the labels come out unordered and misaligned with the bars. Hence, sort_values() and unique() are required:
x_dates = df['element_date'].dt.strftime('%m-%d').sort_values().unique()
ax.xaxis.set_major_formatter(plt.FixedFormatter(x_dates))
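For context on why this lines up, and why DateFormatter complained about x=0: as far as I can tell, a count plot treats x as categorical and draws one group of bars at each integer position 0, 1, 2, ..., so FixedFormatter simply assigns the i-th label to the i-th position. A small pandas-only sketch of that mapping, using the sample timestamps from the question (the variable names here are illustrative, not part of the original code):

```python
import pandas as pd

# Sample timestamps copied from the question's df.head() output.
dates = pd.Series(pd.to_datetime([
    "2018-05-28 10:59:00",
    "2018-05-28 10:59:00",
    "2018-05-27 13:58:00",
    "2018-05-30 21:01:00",
    "2018-05-30 21:01:00",
]))

# One "MM-DD" label per distinct day, in sorted order.
labels = [str(s) for s in dates.dt.strftime("%m-%d").sort_values().unique()]

# FixedFormatter hands labels[i] to the tick at integer position i.
position_to_label = dict(enumerate(labels))
print(position_to_label)
```

Note that this only relabels positions; it assumes the bars themselves are already drawn in the same sorted order.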
To demonstrate (the mdate column is never used):
Data
from io import StringIO
...
txt = '''id Tier element element_date mdate
5228039 "Tier B" 4 "2018-05-28 10:59:00" 736842.457639
5232263 "Tier B" 3 "2018-05-28 10:59:00" 736842.457639
5245478 "Tier B" EA "2018-05-27 13:58:00" 736841.581944
4975552 "Tier B" 2 "2018-05-30 21:01:00" 736844.875694
4975563 "Tier A" 2 "2018-05-30 21:01:00" 736844.875694'''
df = pd.read_table(StringIO(txt), sep=r"\s+", parse_dates=[3])
Plot
fig, ax = plt.subplots(figsize=(13,4))
# countplot returns the Axes it drew on, so don't overwrite fig with it
sns.countplot(x="element_date", hue="element", data=df, ax=ax)
# one '%m-%d' label per distinct date, in sorted order
x_dates = df['element_date'].dt.strftime('%m-%d').sort_values().unique()
ax.xaxis.set_major_formatter(plt.FixedFormatter(x_dates))
plt.legend(loc='upper left')
plt.show()
plt.close()
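One caveat with the above: FixedFormatter only relabels tick positions, so it relies on the bars already being in date order and on there being exactly one bar group per label. As far as I know, seaborn orders non-numeric categories by first appearance unless told otherwise, so unsorted data could end up mislabeled. A variant that avoids that dependency, sketched under the assumption that a helper string column (here called day, a name I made up) is acceptable, is to plot the formatted strings directly and pass an explicit order:

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Sample rows copied from the question.
df = pd.DataFrame({
    "Tier": ["Tier B", "Tier B", "Tier B", "Tier B", "Tier A"],
    "element": ["4", "3", "EA", "2", "2"],
    "element_date": pd.to_datetime([
        "2018-05-28 10:59:00",
        "2018-05-28 10:59:00",
        "2018-05-27 13:58:00",
        "2018-05-30 21:01:00",
        "2018-05-30 21:01:00",
    ]),
})

# Hypothetical helper column: one "MM-DD" string per row.
df["day"] = df["element_date"].dt.strftime("%m-%d")

# Explicit category order; plain string sorting is only correct
# while all dates fall within a single year.
day_order = sorted(df["day"].unique())

fig, ax = plt.subplots(figsize=(13, 4))
sns.countplot(x="day", hue="element", data=df, order=day_order, ax=ax)
ax.legend(loc="upper left")

# Render once so the tick labels are populated, then read them back.
fig.canvas.draw()
tick_labels = [t.get_text() for t in ax.get_xticklabels()]
```

Call plt.show() afterwards to display the figure. Because the labels are the category values themselves, no formatter is needed, and rows sharing a calendar day are counted together even when their times differ.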
Upvotes: 5