Fiori

Reputation: 311

matplotlib DateFormatter not showing correct dates with yyyy-mm-dd column

I have a pandas DataFrame with two columns: a numeric column (total_outbounds) for the y-axis and a date column (month, pardon the bad name) for the x-axis.

When I run this code to create a graph from this dataframe:

import matplotlib.pyplot as plt
import matplotlib.dates as mdates

fig, ax = plt.subplots()

my_df.plot(x='month', y='total_outbounds', ax=ax, label='Total Email Outbounds on LE Change')
ax.xaxis.set_major_formatter(mdates.DateFormatter('%m/%y'))
plt.xlabel('')
plt.title('Total LE Changes and Outbounds by Month', pad=10)

I receive a graph where the X-axis is not what I was hoping for... Am I using mdates.DateFormatter wrong? Looking to receive mm/yy on the X-Axis, instead of the Apr, Jul, etc. that are currently appearing.

For reproducibility, here is the dataframe output with my_df.to_dict()

{'month': {0: Timestamp('2020-01-01 00:00:00'),
  1: Timestamp('2020-02-01 00:00:00'),
  2: Timestamp('2020-03-01 00:00:00'),
  3: Timestamp('2020-04-01 00:00:00'),
  4: Timestamp('2020-05-01 00:00:00'),
  5: Timestamp('2020-06-01 00:00:00'),
  6: Timestamp('2020-07-01 00:00:00'),
  7: Timestamp('2020-08-01 00:00:00'),
  8: Timestamp('2020-09-01 00:00:00'),
  9: Timestamp('2020-10-01 00:00:00'),
  10: Timestamp('2020-11-01 00:00:00'),
  11: Timestamp('2020-12-01 00:00:00'),
  12: Timestamp('2021-01-01 00:00:00'),
  13: Timestamp('2021-02-01 00:00:00'),
  14: Timestamp('2021-03-01 00:00:00')},
 'total_outbounds': {0: 26364,
  1: 33081,
  2: 35517,
  3: 34975,
  4: 40794,
  5: 51659,
  6: 50948,
  7: 65332,
  8: 82839,
  9: 96408,
  10: 86923,
  11: 99176,
  12: 122199,
  13: 116057,
  14: 108439}}

and I think you should be able to use pd.DataFrame.from_dict() to turn that back into a dataframe my_df from the dictionary. Please let me know if there's a more reproducible way to share the dataframe.
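A minimal sketch of that round trip, assuming the printed dictionary is pasted into a script. Timestamp has to be imported for the pasted text to evaluate, and only the first three rows are repeated here to keep it short; the variable name data is just for illustration.

```python
import pandas as pd
from pandas import Timestamp  # needed so the pasted to_dict() output evaluates

# First three rows of the to_dict() output above (truncated for brevity)
data = {
    'month': {
        0: Timestamp('2020-01-01 00:00:00'),
        1: Timestamp('2020-02-01 00:00:00'),
        2: Timestamp('2020-03-01 00:00:00'),
    },
    'total_outbounds': {0: 26364, 1: 33081, 2: 35517},
}

# Outer keys become columns, inner keys become the index
my_df = pd.DataFrame.from_dict(data)
print(my_df.dtypes)
```

The month column comes back as a datetime column, so no extra conversion should be needed for this particular data.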

Edit: the solution in the comments works. However, I now cannot rotate the minor ticks using plt.xticks(rotation=50); this only rotates the two major ticks. The X-axis values that appear are also odd (showing 71 as the year?).

Upvotes: 4

Views: 3799

Answers (2)

riverflow

Reputation: 9

I had this issue too and eventually fixed it, so I hope my experience helps here.

The likely cause is that the 'dates' in your [month] column are strings, not datetime objects, so DateFormatter does not interpret them the way you want.

So the first step is to convert the 'dates' in your [month] column into datetime objects. To do this, simply use:

df['month'] = pd.to_datetime(df['month'])

(Be aware that this line may stop working if you set xlim under Matplotlib 3.4.2, though it works fine under Matplotlib 3.3.0.)

Then use the converted df['month'] as the x-axis when plotting (for example):

fig, ax = plt.subplots(figsize=(12, 12))

ax.plot(
    df['month'],
    df['total_outbounds']
)

Then you can add the formatting (for example):

ax.xaxis.set_major_formatter(mdates.DateFormatter("%b, %Y"))

Then everything worked, at least on my Mac.
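To check whether this applies to your data, here is a small sketch that inspects the column's dtype before converting. The frame below is hypothetical (a month column holding yyyy-mm-dd strings), and the explicit format argument is an assumption about how the strings are written.

```python
import pandas as pd

# Hypothetical frame whose 'month' column holds yyyy-mm-dd strings
df = pd.DataFrame({
    'month': ['2020-01-01', '2020-02-01', '2020-03-01'],
    'total_outbounds': [26364, 33081, 35517],
})

# False while the column still holds strings
was_datetime = pd.api.types.is_datetime64_any_dtype(df['month'])

if not was_datetime:
    # Parse the strings; format is assumed to be yyyy-mm-dd
    df['month'] = pd.to_datetime(df['month'], format='%Y-%m-%d')

is_datetime = pd.api.types.is_datetime64_any_dtype(df['month'])
print(was_datetime, is_datetime)
```

If the first check already reports a datetime column, the conversion is a no-op and the formatting problem lies elsewhere.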

Upvotes: 0

tdy

Reputation: 41327

As discussed in the comments, the Apr/Jul/Oct are minor ticks.

However, rather than customizing both major/minor ticks, I suggest increasing the major tick frequency, disabling minor ticks, and using autofmt_xdate() to style the date ticks:

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

fig, ax = plt.subplots()
ax.plot(df.month, df.total_outbounds, label='Total Email Outbounds on LE Change')
ax.legend()

# increase the major tick frequency (8 ticks in this example)
start, end = ax.get_xlim()
xticks = np.linspace(start, end, 8)
ax.set_xticks(xticks)
ax.set_xticklabels(xticks)

# set date format
ax.xaxis.set_major_formatter(mdates.DateFormatter('%m/%y'))

# use matplotlib's auto date styling
fig.autofmt_xdate()

# disable minor ticks
plt.minorticks_off()

manual date ticks
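As a variation on the same idea, here is a sketch that swaps np.linspace for matplotlib's MonthLocator, so the major ticks land on the first of a month rather than at evenly spaced positions. This is not the code above, just an alternative under the assumption that one tick every second month is acceptable; the data is rebuilt from the question, and the Agg backend line is only there so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # optional: non-interactive backend, no display needed
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd

# Data from the question: 15 month starts from 2020-01 to 2021-03
df = pd.DataFrame({
    'month': pd.date_range('2020-01-01', periods=15, freq='MS'),
    'total_outbounds': [26364, 33081, 35517, 34975, 40794, 51659, 50948, 65332,
                        82839, 96408, 86923, 99176, 122199, 116057, 108439],
})

fig, ax = plt.subplots()
# Plot through matplotlib directly so the x-axis uses matplotlib date units
ax.plot(df['month'], df['total_outbounds'], label='Total Email Outbounds on LE Change')
ax.legend()

# Major ticks on the first day of every odd-numbered month, shown as mm/yy
ax.xaxis.set_major_locator(mdates.MonthLocator(bymonth=(1, 3, 5, 7, 9, 11)))
ax.xaxis.set_major_formatter(mdates.DateFormatter('%m/%y'))

# No minor ticks; rotate the remaining (major) labels
ax.minorticks_off()
ax.tick_params(axis='x', labelrotation=50)

fig.canvas.draw()
labels = [tick.get_text() for tick in ax.get_xticklabels()]
print(labels)
```

Because the line is drawn with ax.plot rather than DataFrame.plot, the formatter reads the x values as ordinary matplotlib dates, which should avoid the odd year values mentioned in the question's edit.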

Upvotes: 4
