Reputation: 998
I want to be able to set the major and minor xticks and their labels for a time series graph plotted from a Pandas time series object.
The Pandas 0.9 "what's new" page says:
"you can either use to_pydatetime or register a converter for the Timestamp type"
but I can't work out how to do that so that I can use the matplotlib ax.xaxis.set_major_locator and ax.xaxis.set_major_formatter (and minor) commands.
If I use them without converting the pandas times, the x-axis ticks and labels end up wrong.
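For reference, here is a minimal sketch of what the two options quoted from the release notes might look like in a recent pandas. It assumes that pandas.plotting.register_matplotlib_converters() is the modern counterpart of "register a converter"; the 0.9 notes do not name a function, so treat that mapping as an assumption.

```python
import matplotlib.dates as mdates
import pandas as pd
from pandas.plotting import register_matplotlib_converters

idx = pd.date_range("2011-05-01", periods=3, freq="D")

# Option 1: convert the pandas Timestamps to plain datetime.datetime objects,
# which matplotlib.dates understands natively.
py_dates = idx.to_pydatetime()

# Option 2 (assumed modern equivalent of "register a converter"): let pandas
# register its converters for Timestamp and related types with matplotlib.
register_matplotlib_converters()

# matplotlib represents dates as floats counted in days.
nums = mdates.date2num(py_dates)
```

With either option the x values are in matplotlib's day-based date units, which is what the locators and formatters in matplotlib.dates expect.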
By using the 'xticks' parameter, I can pass the major ticks to pandas' .plot, and then set the major tick labels. I can't work out how to do the minor ticks using this approach (I can set the labels on the default minor ticks set by pandas' .plot).
Here is my test code:
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import numpy as np
dateIndex = pd.date_range(start='2011-05-01', end='2011-07-01', freq='D')
testSeries = pd.Series(data=np.random.randn(len(dateIndex)), index=dateIndex)
ax = plt.figure(figsize=(7,4), dpi=300).add_subplot(111)
testSeries.plot(ax=ax, style='v-', label='first line')
# using MatPlotLib date time locators and formatters doesn't work with new
# pandas datetime index
ax.xaxis.set_minor_locator(mdates.WeekdayLocator())
ax.xaxis.set_minor_formatter(mdates.DateFormatter('%d\n%a'))
ax.xaxis.grid(True, which="minor")
ax.xaxis.grid(False, which="major")
ax.xaxis.set_major_formatter(mdates.DateFormatter('\n\n\n%b%Y'))
plt.show()
# set the major xticks and labels through pandas
ax2 = plt.figure(figsize=(7,4), dpi=300).add_subplot(111)
xticks = pd.date_range(start='2011-05-01', end='2011-07-01', freq='W-Tue')
testSeries.plot(ax=ax2, style='-v', label='second line', xticks=xticks.to_pydatetime())
ax2.set_xticklabels([x.strftime('%a\n%d\n%h\n%Y') for x in xticks]);
# remove the minor xtick labels set by pandas.plot
ax2.set_xticklabels([], minor=True)
# turn the minor ticks created by pandas.plot off
plt.show()
Update: I've been able to get closer to the layout I wanted by using a loop to build the major xtick labels:
# only show month for first label in month
month = dStart.month - 1
xticklabels = []
for x in xticks:
    if month != x.month:
        xticklabels.append(x.strftime('%d\n%a\n%h'))
        month = x.month
    else:
        xticklabels.append(x.strftime('%d\n%a'))
However, this is a bit like doing the x-axis using ax.annotate: possible but not ideal.
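A self-contained version of that label-building loop might look like the sketch below. The helper name month_aware_labels is made up for illustration, the "previous month" starts as None instead of dStart.month - 1 (dStart is not defined in the snippet above), and %b replaces %h because %h is not available on every platform.

```python
import pandas as pd


def month_aware_labels(ticks):
    """Return one label per tick; only the first tick of each month names the month.

    Hypothetical helper, not part of pandas or matplotlib.
    """
    labels = []
    previous_month = None  # guarantees that the first label carries the month
    for tick in ticks:
        if tick.month != previous_month:
            labels.append(tick.strftime('%d\n%a\n%b'))
            previous_month = tick.month
        else:
            labels.append(tick.strftime('%d\n%a'))
    return labels


xticks = pd.date_range(start='2011-05-01', end='2011-07-01', freq='W-TUE')
xticklabels = month_aware_labels(xticks)
```

The result can be passed to ax2.set_xticklabels(xticklabels) in place of the list comprehension used in the test code.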
How do I set the major and minor ticks when plotting pandas time-series data?
Upvotes: 88
Views: 134416
Reputation: 23081
In matplotlib's plot(), the default time-series unit is 1 day, but in pandas' plot(), 1 unit equals the frequency of the time series: if the frequency is 1 day, 1 unit is 1 day; if it is 1 hour, 1 unit is 1 hour, and so on. This makes the plot() calls of matplotlib and pandas behave differently for time-series data.
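To make the unit difference concrete, the sketch below compares how many units the same date is from the epoch at daily and at weekly frequency. It assumes that pandas positions time-series points by their period ordinal, which is my understanding of where the frequency-based unit comes from; the code only inspects pd.Period ordinals and does not plot anything.

```python
import pandas as pd

day = pd.Timestamp("2011-05-01")

# Number of periods since the epoch at each frequency (assumed to be the
# x-coordinate pandas uses when it plots a series with that frequency).
daily_units = pd.Period(day, freq="D").ordinal
weekly_units = pd.Period(day, freq="W").ordinal

# A weekly unit spans about seven daily units, so a locator that assumes
# 1 unit == 1 day will misplace ticks on a weekly axis.
ratio = daily_units / weekly_units
```

The ratio comes out close to 7, which is why day-based locators from matplotlib.dates only line up when the series frequency is itself 1 day.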
If the frequency of the time series is 1 day, then matplotlib.dates.WeekdayLocator, matplotlib.dates.MonthLocator etc. can "locate" tick positions [1], because pandas' plot() uses 1 day as the base unit for xtick positions (which coincides with matplotlib's default).
Since pandas' plot() call returns an Axes object, the tick labels of that Axes object may be modified using matplotlib.dates.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
idx = pd.date_range('2011-05-01', '2011-07-01', freq='D')
s1 = pd.Series(np.random.randn(len(idx)), index=idx)
ax = s1.plot(style='v-')
ax.xaxis.set(
    minor_locator=mdates.WeekdayLocator(),                 # make minor ticks on each Tuesday
    minor_formatter=mdates.DateFormatter('%d\n%a'),        # format minor ticks
    major_locator=mdates.MonthLocator(),                   # make major ticks on first day of each month
    major_formatter=mdates.DateFormatter('\n\n\n%b\n%Y')   # format major ticks
);
However, if the frequency is not 1 day but, say, 1 week, then matplotlib.dates won't be able to locate the positions because, as mentioned previously, pandas' plot() sets the unit to be the same as the time-series frequency (1 week), which "confuses" matplotlib.dates. So if we reuse the code that set the tick labels of s1 on a weekly series s2 (defined below), the tick labels come out badly wrong.
To "solve" the problem, one way is to remove pandas' automatic tick resolution adjustment by passing x_compat=True. Then major/minor tick labels may be set using matplotlib's resolution; in other words, they may be set in the same way as above.
idx = pd.date_range('2011-05-01', '2011-07-01', freq='W')
s2 = pd.Series(np.random.randn(len(idx)), index=idx)
ax = s2.plot(style='v-', x_compat=True, rot=0)
ax.xaxis.set(
    minor_locator=mdates.WeekdayLocator(),               # make minor ticks on each Tuesday
    minor_formatter=mdates.DateFormatter('%d'),          # format minor ticks
    major_locator=mdates.MonthLocator(),                 # make major ticks on first day of each month
    major_formatter=mdates.DateFormatter('\n\n%b\n%Y')   # format major ticks
);
Another way to get around the issue is to use matplotlib's plot() instead (as suggested by @bmu). Because the unit is fixed in matplotlib, we can set the tick labels as above without issue.
plt.plot(s2.index, s2, 'v-') # use matplotlib instead
plt.gca().xaxis.set(
    minor_locator=mdates.WeekdayLocator(byweekday=0),    # make minor ticks on each Monday
    minor_formatter=mdates.DateFormatter('%d'),          # format minor ticks
    major_locator=mdates.MonthLocator(),                 # make major ticks on first day of each month
    major_formatter=mdates.DateFormatter('\n\n%b\n%Y')   # format major ticks
);
[1] matplotlib.dates.num2timedelta(1) == datetime.timedelta(days=1) is True.
Upvotes: 2
Reputation: 3311
To turn off pandas' datetime tick adjustment, pass the argument x_compat=True to plot().
Example:
ds.plot(x_compat=True)
See more examples in the Pandas documentation: Suppressing tick resolution adjustment
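A runnable sketch of the same idea, using the plot_params context manager from the pandas visualization docs instead of the keyword argument. The series ds, the weekly frequency, the Agg backend, and the '%Y-%m' label format are choices made here for illustration only.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend so the sketch runs without a display
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

idx = pd.date_range("2011-05-01", "2011-07-01", freq="W")
ds = pd.Series(np.random.randn(len(idx)), index=idx)

fig, ax = plt.subplots()

# Inside this block every pandas plot behaves as if x_compat=True was passed,
# so the x axis is in matplotlib's day-based date units.
with pd.plotting.plot_params.use("x_compat", True):
    ds.plot(ax=ax, style="v-")

# matplotlib.dates locators and formatters can now be applied directly.
ax.xaxis.set_major_locator(mdates.MonthLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter("%Y-%m"))

fig.canvas.draw()  # force tick labels to be generated
labels = [tick.get_text() for tick in ax.get_xticklabels()]
```

The context manager is handy when several plots in a row should all skip the tick resolution adjustment; for a single plot, ds.plot(x_compat=True) does the same job.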
Upvotes: 4
Reputation: 36184
Both pandas and matplotlib.dates use matplotlib.units for locating the ticks.
But while matplotlib.dates
has convenient ways to set the ticks manually, pandas seems to have the focus on auto formatting so far (you can have a look at the code for date conversion and formatting in pandas).
So for the moment it seems more reasonable to use matplotlib.dates
(as mentioned by @BrenBarn in his comment).
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as dates
idx = pd.date_range('2011-05-01', '2011-07-01')
s = pd.Series(np.random.randn(len(idx)), index=idx)
fig, ax = plt.subplots()
# ax.plot accepts datetimes directly; plot_date is deprecated in recent matplotlib
ax.plot(idx.to_pydatetime(), s, 'v-')
ax.xaxis.set_minor_locator(dates.WeekdayLocator(byweekday=(1), interval=1))
ax.xaxis.set_minor_formatter(dates.DateFormatter('%d\n%a'))
ax.xaxis.grid(True, which="minor")
ax.yaxis.grid()
ax.xaxis.set_major_locator(dates.MonthLocator())
ax.xaxis.set_major_formatter(dates.DateFormatter('\n\n\n%b\n%Y'))
plt.tight_layout()
plt.show()
(my locale is German, so that Tuesday [Tue] becomes Dienstag [Di])
Upvotes: 94