leekimber

Reputation: 33

How to auto-deduce axes in pandas plot()

I am struggling to replicate the elegant ease - and successful outcome - teasingly promised in the 'Basic Plotting:plot' section of the pandas df.plot() documentation at:

http://pandas.pydata.org/pandas-docs/stable/visualization.html#visualization

There the authors' first image is pretty close to the kind of line graph I want to plot from my dataframe. Their first df produces a single-line plot, just as I hoped my df below would.

My df looks like this:

            2014-03-28  2014-04-04  2014-04-11  2014-04-18  \
Jenny Todd    1699.6      1741.6      1710.7      1744.2   

            2014-04-25  2014-05-02  2014-05-09  
Jenny Todd    1764.2      1789.7      1802.3 

Their second image is a multi-line graph very similar to what I hoped for when I plot a multiple-index version of my df, e.g.:

                    2014-06-13  2014-06-20  2014-06-27  \
William Acer        1674.7      1689.4      1682.0   
Katherine Baker     1498.5      1527.3      1530.5   


                    2014-07-04  2014-07-11  2014-07-18  \
William Acer        1700.0      1674.5      1677.8   
Katherine Baker     1540.4      1522.3      1537.3   

                    2014-07-25  
William Acer        1708.0  
Katherine Baker     1557.1

However, they get plots. I get featureless 3.3kb images and a warning:

/home/lee/test/local/lib/python2.7/site-packages/matplotlib/axes/_base.py:2787: UserWarning: Attempting to set identical left==right results in singular transformations; automatically expanding.
left=0.0, right=0.0
  'left=%s, right=%s') % (left, right))

The authors of the documentation seem to have the plot() function deducing from the df's indexes the values of the x-axis and the range and values of the y axis.

Searching around, I can find people with different data, different indexes and different scenarios (for example, plotting one column against another or trying to produce multiple subplots) who get this kind of 'axes' error. However, I haven't been able to map their issues to mine.

I wonder if anyone can help resolve what is different about my data or code that leads to a different plot outcome from the documentation's seemingly-similar data and seemingly-similar code.

My code:

print(plotting_df)  # (This produces the df examples I pasted above)
plottest = plotting_df.plot.line(title='Calorie Intake', legend=True)
plottest.set_xlabel('Weeks')
plottest.set_ylabel('Calories')
fig = plt.figure()
plot_name = week_ending + '_' + collection_name + '.png'
fig.savefig(plot_name)

Note this dataframe is being created dynamically many times within the script. On any given run, the script will acquire different sets of dates, differently-named people, and different numbers to plot. So I don't have predictability about what strings will come up for index and legend labels for plotting beforehand. I do have predictability about the format.

I get that my dataframe's date index has differently-formatted dates than the referred documentation describes. Is this the cause? Whether it is or isn't, how should one best handle this issue?

Added on 2016-08-24 to answer the comment below about being unable to recreate my data

plotting_df is created on the fly as a subset of a much larger dataframe. It's simply an index (or sometimes multiple indices) and some of the date columns extracted from the larger dataframe. The code that produces plotting_df works fine and always produces plotting_df with correct indices and columns in a format I expect.

I can simulate creation of a dataset to store in plotting_df with this python code:

plotting_1 = {
          '2014-03-28': 1699.6,
          '2014-04-04': 1741.6,
          '2014-04-11': 1710.7,
          '2014-04-18': 1744.2,
          '2014-04-25': 1764.2,
          '2014-05-02': 1789.7,
          '2014-05-09': 1802.3
        }

plotting_df = pd.DataFrame(plotting_1, index=['Jenny Todd'])

and I can simulate creation of a multiple-indices plotting_df with this python code:

plotting_2 = {
            'Katherine Baker': {
                '2014-06-13': 1498.5,
                '2014-06-20': 1527.3,
                '2014-06-27': 1530.5,
                '2014-07-04': 1540.4,
                '2014-07-11': 1522.3,
                '2014-07-18': 1537.3,
                '2014-07-25': 1557.1
            },
            'William Acer': {
                '2014-06-13': 1674.7,
                '2014-06-20': 1689.4,
                '2014-06-27': 1682.0,
                '2014-07-04': 1700.0,
                '2014-07-11': 1674.5,
                '2014-07-18': 1677.8,
                '2014-07-25': 1708.0
            }
}

plotting_df = pd.DataFrame.from_dict(plotting_2, orient='index')  # names as rows, dates as columns, matching the printed df above

I did try the suggested transform with code:

plotdf = plotting_df.T
plotdf.index = pd.to_datetime(plotdf.index)

so that my original code now looks like:

print(plotting_df)  # (This produces the df examples I pasted above)
plotdf = plotting_df.T  # Transpose the df so the date columns become the index
plotdf.index = pd.to_datetime(plotdf.index)  # Convert the index to datetime
plottest = plotdf.plot.line(title='Calorie Intake', legend=True)
plottest.set_xlabel('Weeks')
plottest.set_ylabel('Calories')
fig = plt.figure()
plot_name = week_ending + '_' + collection_name + '.png' 
fig.savefig(plot_name)

but I still get the same result (blank 3.3kb images created).

I did note that adding the transform made no difference when I printed out the first instance of plotdf. So should I be doing some other transform?

Upvotes: 3

Views: 247

Answers (1)

jsignell

Reputation: 3272

This is your problem:

fig = plt.figure()
plot_name = week_ending + '_' + collection_name + '.png' 
fig.savefig(plot_name)

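You are creating a second figure after the plot has already been drawn on the first one, and then you are saving only that second, empty figure. Just take out the line fig = plt.figure() and change fig.savefig to plt.savefig. plt.savefig saves the current figure, which here is the one plot.line() drew on.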
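To see why the saved image is blank, here is a minimal sketch that compares the figure pandas draws on with the one plt.figure() returns. It assumes a non-interactive Agg backend and a cut-down version of the question's data. The names plotted_fig and empty_fig are only for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, so no display is needed
import matplotlib.pyplot as plt
import pandas as pd

# Cut-down version of the question's single-person data, dates as the index
df = pd.DataFrame(
    {"Jenny Todd": [1699.6, 1741.6, 1710.7]},
    index=pd.to_datetime(["2014-03-28", "2014-04-04", "2014-04-11"]),
)

ax = df.plot.line(title="Calorie Intake")
plotted_fig = ax.get_figure()  # the figure that holds the line plot
empty_fig = plt.figure()       # a brand-new figure with no axes on it

# The first figure has one Axes; the second has none, so saving it gives a blank image
print(len(plotted_fig.axes), len(empty_fig.axes))
```

Calling savefig on empty_fig is what the original code was doing.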

So you should have:

print(plotting_df)  # (This produces the df examples I pasted above)
plotdf = plotting_df.T  # Transpose the df so the date columns become the index
plotdf.index = pd.to_datetime(plotdf.index) # Convert indices to datetime
plottest = plotdf.plot.line(title='Calorie Intake', legend=True)
plottest.set_xlabel('Weeks')
plottest.set_ylabel('Calories')
plot_name = week_ending + '_' + collection_name + '.png' 
plt.savefig(plot_name)
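Since the script builds many plots in one run, an alternative that does not depend on which figure is "current" is to ask the returned Axes for its own figure and save that. The sketch below is self-contained and makes a few assumptions: it uses a shortened copy of the question's two-person data, builds it with orient="index" so names are rows as in the printed df, uses the Agg backend, and writes to a hypothetical file name in a temporary directory.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # assumption: headless backend
import matplotlib.pyplot as plt
import pandas as pd

# Shortened copy of the question's data: one row per person, one column per date
plotting_2 = {
    "William Acer": {"2014-06-13": 1674.7, "2014-06-20": 1689.4, "2014-06-27": 1682.0},
    "Katherine Baker": {"2014-06-13": 1498.5, "2014-06-20": 1527.3, "2014-06-27": 1530.5},
}
plotting_df = pd.DataFrame.from_dict(plotting_2, orient="index")

plotdf = plotting_df.T                       # dates become the index, people the columns
plotdf.index = pd.to_datetime(plotdf.index)  # date strings become real timestamps

ax = plotdf.plot.line(title="Calorie Intake", legend=True)
ax.set_xlabel("Weeks")
ax.set_ylabel("Calories")

fig = ax.get_figure()  # the figure this Axes lives on, not a new one
plot_name = os.path.join(tempfile.mkdtemp(), "calories.png")  # hypothetical file name
fig.savefig(plot_name)
plt.close(fig)  # release the figure before the next plot in a long-running script
```

Closing each figure after saving keeps memory from growing when the dataframe is rebuilt and plotted many times.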

Upvotes: 1
