Reputation: 1749
I am using matplotlib's imshow() function to show a pandas.DataFrame.
I would like the labels and ticks for both x and y axes to be drawn from the DataFrame.index and DataFrame.columns lists, but I can't figure out how to do it.
Assuming that data is a pandas.DataFrame:
>>> print data
<class 'pandas.core.frame.DataFrame'>
Index: 201 entries, 1901 to 2101
Data columns:
jan 201 non-null values
feb 201 non-null values
mar 201 non-null values
apr 201 non-null values
may 201 non-null values
jun 201 non-null values
jul 201 non-null values
aug 201 non-null values
sep 201 non-null values
oct 201 non-null values
nov 201 non-null values
dec 201 non-null values
When I do this:
ax1 = fig.add_subplot(131, xticklabels=data.columns, yticklabels=data.index)
ax1.set_title("A")
ax1.tick_params(axis='both', direction='out')
im1 = ax1.imshow(data,
                 interpolation='nearest',
                 aspect='auto',
                 cmap=cmap)
I end up with nicely spaced tick labels on the y axis of the image, but the labels are 1901-1906 instead of 1901 through 2101. Likewise, the x axis tick labels are feb-jul instead of jan-dec.
If I use
ax1 = fig.add_subplot(131) # without specifying tick labels
then the axis tick labels are simply the underlying ndarray index values (i.e. 0-201 and 0-12). I don't need to change the spacing or number of ticks and labels; I just want the label text to come from the DataFrame index and column lists. Am I missing something easy?
Thanks in advance.
Upvotes: 9
Views: 16243
Reputation: 2345
As a general solution, I have found the following to be an easy way to use a pandas datetime64 index for matplotlib axis labels.
First, convert the pandas datetime64 index to an array of Python datetime.datetime objects (despite the name new_series below, to_pydatetime() returns a NumPy array, not a pandas Series).
new_series = your_pandas_dataframe.index.to_pydatetime()
Now you have all the functionality of matplotlib.dates. Before plotting, import matplotlib.dates as mdates and declare the following variables:
years = mdates.YearLocator()
months = mdates.MonthLocator()
days = mdates.DayLocator()
hours = mdates.HourLocator(interval=12) # a tick every 12 hours; a bare positional 12 would be read as byhour=12, i.e. one tick at noon each day
minutes = mdates.MinuteLocator()
daysFmt = mdates.DateFormatter('%m/%d') #or whatever format you want
Now, make your plots, using the new_series as the x-axis:
fig1 = plt.figure()
ax = fig1.add_subplot(111)
ax.plot(new_series, your_pandas_dataframe)
You can use the mdates objects declared above to tweak the labels and ticks to your liking, for example:
ax.xaxis.set_major_locator(days)
ax.xaxis.set_major_formatter(daysFmt)
ax.xaxis.set_minor_locator(hours)
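For reference, the steps above can be put together into one runnable sketch. The DataFrame here is made up for illustration (ten days of daily values in a hypothetical "value" column), and the Agg backend is selected only so that it runs without a display; adapt both to your own data and setup.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line for interactive use
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical data: ten daily readings with a datetime64 index
frame = pd.DataFrame(
    {"value": np.arange(10.0)},
    index=pd.date_range("2013-01-01", periods=10, freq="D"),
)

# NumPy array of datetime.datetime objects, one per row
x = frame.index.to_pydatetime()

fig, ax = plt.subplots()
ax.plot(x, frame["value"])

# Major ticks once per day labelled as month/day, minor ticks every 12 hours
ax.xaxis.set_major_locator(mdates.DayLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter("%m/%d"))
ax.xaxis.set_minor_locator(mdates.HourLocator(interval=12))

fig.canvas.draw()  # force the tick labels to be computed
major_labels = [t.get_text() for t in ax.get_xticklabels()]
```

After the draw call, major_labels holds the formatted day labels, which is a quick way to check that the formatter is doing what you expect before saving or showing the figure.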
Upvotes: 6
Reputation: 25692
I've found that the easiest way to do this is with ImageGrid. Here's the code to do this and the plot; here is an IPython notebook that shows it in a more presentable format:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from mpl_toolkits.axes_grid1 import ImageGrid

mons = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
        'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']
# just get the first 5 rows for illustration purposes
df = pd.DataFrame(np.random.randn(201, len(mons)), columns=mons,
                  index=np.arange(1901, 2102))[:5]
fig = plt.figure(figsize=(20, 100))
# add_all=True is left out here; newer matplotlib releases no longer accept it
grid = ImageGrid(fig, 111, nrows_ncols=(1, 1),
                 direction='row', axes_pad=0.05,
                 label_mode='1', share_all=False,
                 cbar_location='right', cbar_mode='single',
                 cbar_size='10%', cbar_pad=0.05)
ax = grid[0]
ax.set_title('A', fontsize=40)
ax.tick_params(axis='both', direction='out', labelsize=20)
# use one colour scale spanning the whole frame
im = ax.imshow(df.values, interpolation='nearest', vmax=df.max().max(),
               vmin=df.min().min())
ax.cax.colorbar(im)
ax.cax.tick_params(labelsize=20)
# one tick per column and per row, labelled from the DataFrame
ax.set_xticks(np.arange(df.shape[1]))
ax.set_xticklabels(mons)
ax.set_yticks(np.arange(df.shape[0]))
ax.set_yticklabels(df.index)
Upvotes: 3
Reputation: 521
I believe the issue has to do with specifying the tick labels for existing ticks. By default, there are fewer ticks than labels so only the first few labels are used. The following should work by first setting the number of ticks.
ax1 = fig.add_subplot(131)
ax1.set_title("A")
ax1.tick_params(axis='both', direction='out')
ax1.set_xticks(range(len(data.columns)))
ax1.set_xticklabels(data.columns)
ax1.set_yticks(range(len(data.index)))
ax1.set_yticklabels(data.index)
im1 = ax1.imshow(data, interpolation='nearest', aspect='auto', cmap=cmap)
This produces a tick for every year on the y-axis, so you might want to use a subset of the index values.
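One way to use such a subset is to place a tick on every Nth row only. The sketch below assumes random data shaped like the frame in the question and a step of 20 years; both the step value and the variable names y_positions and y_labels are illustrative choices, not part of the original code.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

months = ["jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sep", "oct", "nov", "dec"]
# Hypothetical stand-in for the question's data: 201 years x 12 months
data = pd.DataFrame(np.random.rand(201, 12),
                    index=np.arange(1901, 2102), columns=months)

fig, ax1 = plt.subplots()
im1 = ax1.imshow(data.values, interpolation="nearest", aspect="auto")

# One tick per column, labelled with the column names
ax1.set_xticks(np.arange(len(data.columns)))
ax1.set_xticklabels(data.columns)

# Only every 20th row gets a tick, labelled with the matching index value
step = 20
y_positions = np.arange(0, len(data.index), step)
y_labels = [str(year) for year in data.index[y_positions]]
ax1.set_yticks(y_positions)
ax1.set_yticklabels(y_labels)
```

The key point is that the tick positions and the labels are sliced with the same positions, so each label stays attached to the row it describes.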
Upvotes: 9