tbc

Reputation: 1749

pandas, matplotlib, use dataframe index as axis tick labels

I am using matplotlib's imshow() function to show a pandas.DataFrame.

I would like the labels and ticks for both x and y axes to be drawn from the DataFrame.index and DataFrame.columns lists, but I can't figure out how to do it.

Assuming that data is a pandas.DataFrame:

>>> print(data)
<class 'pandas.core.frame.DataFrame'>
Index: 201 entries,  1901 to  2101
Data columns:
jan    201  non-null values
feb    201  non-null values
mar    201  non-null values
apr    201  non-null values
may    201  non-null values
jun    201  non-null values
jul    201  non-null values
aug    201  non-null values
sep    201  non-null values
oct    201  non-null values
nov    201  non-null values
dec    201  non-null values

When I do this:

ax1 = fig.add_subplot(131, xticklabels=data.columns, yticklabels=data.index)
ax1.set_title("A")
ax1.tick_params(axis='both', direction='out')
im1 = ax1.imshow(data, 
                 interpolation='nearest', 
                 aspect='auto',
                 cmap=cmap )

I end up with nicely spaced tick labels on the y axis of the image, but the labels are 1901-1906 instead of 1901 through 2101. Likewise, the x axis tick labels are feb-jul instead of jan-dec.

If I use

ax1 = fig.add_subplot(131) # without specifying tick labels

then I end up with the axis tick labels simply being the underlying ndarray index values (i.e. 0-201 and 0-12). I don't need to modify the spacing or quantity of ticks and labels; I just want the label text to come from the DataFrame index and column lists. Am I missing something easy?

Thanks in advance.

Upvotes: 9

Views: 16243

Answers (3)

pjw

Reputation: 2345

As a general solution, I have found the following method to be an easy way to bring a Pandas datetime64 index into a matplotlib axis label.

First, create a new array by converting the pandas datetime64 index to Python datetime.datetime objects.

new_series = your_pandas_dataframe.index.to_pydatetime()

Now you have all the functionality of matplotlib.dates. Before plotting, import matplotlib.dates as mdates and declare the following variables:

years = mdates.YearLocator()   
months = mdates.MonthLocator()  
days = mdates.DayLocator()
hours = mdates.HourLocator(byhour=[0, 12]) # ticks at 00:00 and 12:00, i.e. every 12 hrs; HourLocator(12) would tick only at noon
minutes = mdates.MinuteLocator() 
daysFmt = mdates.DateFormatter('%m/%d') #or whatever format you want

Now, make your plots, using the new_series as the x-axis:

fig1 = plt.figure()
ax = fig1.add_subplot(111)
ax.plot(new_series,your_pandas_dataframe)

You can use the mdates locators and formatter declared above to tweak the labels and ticks to your liking, for example:

ax.xaxis.set_major_locator(days)
ax.xaxis.set_major_formatter(daysFmt)
ax.xaxis.set_minor_locator(hours)

Upvotes: 6

Phillip Cloud

Reputation: 25692

I've found that the easiest way to do this is with ImageGrid. Here is the code and the resulting plot; there is also an IPython notebook that shows it in a more presentable format:

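# imports for the bare names used below
from matplotlib.pyplot import figure
from numpy import arange
from numpy.random import randn
from pandas import DataFrame

mons = ['Jan',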
 'Feb',
 'Mar',
 'Apr',
 'May',
 'Jun',
 'Jul',
 'Aug',
 'Sep',
 'Oct',
 'Nov',
 'Dec']

# just get the first 5 for illustration purposes
df = DataFrame(randn(201, len(mons)), columns=mons,
               index=arange(1901, 2102))[:5]

from mpl_toolkits.axes_grid1 import ImageGrid
fig = figure(figsize=(20, 100))
grid = ImageGrid(fig, 111, nrows_ncols=(1, 1),
                 direction='row', axes_pad=0.05,
                 label_mode='1', share_all=False,
                 cbar_location='right', cbar_mode='single',
                 cbar_size='10%', cbar_pad=0.05)

ax = grid[0]
ax.set_title('A', fontsize=40)
ax.tick_params(axis='both', direction='out', labelsize=20)
im = ax.imshow(df.values, interpolation='nearest', vmax=df.max().max(),
               vmin=df.min().min())
ax.cax.colorbar(im)
ax.cax.tick_params(labelsize=20)
ax.set_xticks(arange(df.shape[1]))
ax.set_xticklabels(mons)
ax.set_yticks(arange(df.shape[0]))
ax.set_yticklabels(df.index)


Upvotes: 3

zarthur

Reputation: 521

I believe the issue is that the tick labels are assigned, in order, to whatever ticks already exist. By default there are fewer ticks than labels, so only the first few labels are used. The following should work by first setting a tick position for every label.

ax1 = fig.add_subplot(131)
ax1.set_title("A")
ax1.tick_params(axis='both', direction='out')
ax1.set_xticks(range(len(data.columns)))
ax1.set_xticklabels(data.columns)
ax1.set_yticks(range(len(data.index)))
ax1.set_yticklabels(data.index)
im1 = ax1.imshow(data, interpolation='nearest', aspect='auto', cmap=cmap)

This produces a tick for every year on the y-axis, so you might want to use a subset of the index values.
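One way to use a subset is to place a y tick only every few rows and take the matching labels from the index. The sketch below is self-contained and makes some assumptions: the DataFrame is random stand-in data shaped like the one in the question, the stride step = 25 is an arbitrary choice, and the Agg backend is selected only so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no window is needed
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Stand-in for the question's data: years 1901-2101 by month, random values.
months = ["jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sep", "oct", "nov", "dec"]
data = pd.DataFrame(np.random.randn(201, 12),
                    index=np.arange(1901, 2102), columns=months)

fig, ax1 = plt.subplots()
im1 = ax1.imshow(data.values, interpolation="nearest", aspect="auto")

# x axis: one tick per column, labelled from data.columns.
ax1.set_xticks(np.arange(len(data.columns)))
ax1.set_xticklabels(data.columns)

# y axis: one tick every `step` rows, labelled from the matching index values.
step = 25  # arbitrary stride; pick whatever keeps the labels readable
y_positions = np.arange(0, len(data.index), step)
y_labels = [str(year) for year in data.index[y_positions]]
ax1.set_yticks(y_positions)
ax1.set_yticklabels(y_labels)
```

Because imshow places row i at y = i, a tick at position i paired with data.index[i] keeps each label aligned with its row.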

Upvotes: 9
