user2546580

Reputation: 155

How to show a bar and line graph on the same plot

I am unable to show a bar and line graph on the same plot. Example code:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

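# pd.date_range replaces the pd.DatetimeIndex(start=...) constructor, which recent pandas versions no longer accept
Df = pd.DataFrame(data=np.random.randn(10,4), index=pd.date_range(start='2005', freq='M', periods=10), columns=['A','B','C','D'])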

fig = plt.figure()
ax = fig.add_subplot(111)

Df[['A','B']].plot(kind='bar', ax=ax)
Df[['C','D']].plot(ax=ax, color=['r', 'c'])

Upvotes: 11

Views: 20615

Answers (4)

Patrick FitzGerald

Reputation: 3630

The issue is that the pandas bar plot function treats the dates as a categorical variable where each date is considered to be a unique category, so the x-axis units are set to integers starting at 0 (like the default DataFrame index when none is assigned).

The pandas line plot uses x-axis units corresponding to the DatetimeIndex, for which 0 is located on January 1970 and the integers count the number of periods (months in this example) since then. So let's take a look at what happens in this particular case:

import numpy as np     # v 1.19.2
import pandas as pd    # v 1.1.3

# Create random data
rng = np.random.default_rng(seed=1) # random number generator
df = pd.DataFrame(data=rng.normal(size=(10,4)),
                  index=pd.date_range(start='2005', freq='M', periods=10),
                  columns=['A','B','C','D'])

# Create a pandas bar chart overlaid with a pandas line plot using the same
# Axes: note that seeing as I do not set any variable for x, df.index is used
# by default, which is usually what we want when dealing with a dataset
# containing a time series
ax = df.plot.bar(y=['A','B'], figsize=(9,5))
df.plot(y=['C','D'], color=['tab:green', 'tab:red'], ax=ax);

[Figure: pandas_bar_line_wrongx]

The bars are nowhere to be seen. If you check which x ticks are being used, you'll see that the single major tick, placed on January, is at 420, followed by these minor ticks for the other months:

ax.get_xticks(minor=True)
# [421, 422, 423, 424, 425, 426, 427, 428, 429]

This is because 35 years * 12 months = 420 months separate January 1970 from January 2005, and the numbering starts at 0, so January 2005 lands on 420. This explains why we do not see the bars: they are drawn at positions 0 to 9, far to the left of the visible range. If you change the x-axis limit to start from zero, here is what you get:

ax = df.plot.bar(y=['A','B'], figsize=(9,5))
df.plot(y=['C','D'], color=['tab:green', 'tab:red'], ax=ax)
ax.set_xlim(0);

[Figure: pandas_bar_line_setxlim]

The bars are squashed to the left, starting on January 1970. This problem can be solved by setting use_index=False in the line plot function so that the lines also start at 0:

ax = df.plot.bar(y=['A','B'], figsize=(9,5))
df.plot(y=['C','D'], color=['tab:green', 'tab:red'], ax=ax, use_index=False)
ax.set_xticklabels(df.index.strftime('%b'), rotation=0, ha='center');

# # Optional: move legend to new position
# import matplotlib.pyplot as plt    # v 3.3.2
# ax.legend().remove()
# plt.gcf().legend(loc=(0.08, 0.14));

[Figure: pandas_bar_line]
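As a quick check of the two x-axis coordinate systems described above, here is a minimal sketch that prints the bar positions and the period number of January 2005. It makes two assumptions that are mine and not part of the example above: the non-interactive Agg backend (so it runs without a display) and the month-start frequency 'MS' instead of 'M' (chosen only because it is accepted across pandas versions).

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so no display is needed
import numpy as np
import pandas as pd

rng = np.random.default_rng(seed=1)
df = pd.DataFrame(rng.normal(size=(10, 4)),
                  # assumption: 'MS' (month start) instead of the original 'M'
                  index=pd.date_range(start="2005-01-01", freq="MS", periods=10),
                  columns=list("ABCD"))

# Bar plot: each date is treated as a category, so bars sit at 0, 1, ..., n-1
ax = df.plot.bar(y=["A", "B"])
bar_positions = [int(t) for t in ax.get_xticks()]
print(bar_positions)

# Line plot units: monthly periods counted from January 1970 (period 0)
first_ordinal = pd.Period("2005-01", freq="M").ordinal
print(first_ordinal)
```

The gap between the last bar position and the first period number is why the bars and the lines cannot both be visible unless one of the two plots is moved onto the other's units.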

In case you want more advanced tick label formatting, you can check out the answers to this question which are compatible with this example. If you need more flexible/automated tick label formatting as provided by the tick locators and formatters in the matplotlib.dates module, the easiest is to create the plot with matplotlib like in this answer.
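For reference, a matplotlib-only version could look roughly like the sketch below. This is not necessarily what the linked answer does; it is one possible approach in which the bars and the lines are both drawn in matplotlib date units, so the locators and formatters from matplotlib.dates apply directly. The bar width of 10 days, the Agg backend, and the 'MS' frequency are arbitrary choices of mine.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so no display is needed
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import numpy as np
import pandas as pd

rng = np.random.default_rng(seed=1)
df = pd.DataFrame(rng.normal(size=(10, 4)),
                  # assumption: 'MS' (month start) instead of the original 'M'
                  index=pd.date_range(start="2005-01-01", freq="MS", periods=10),
                  columns=list("ABCD"))

fig, ax = plt.subplots(figsize=(9, 5))

# Convert the dates to matplotlib date numbers (days) so that bars and
# lines share the same x-axis units
x = mdates.date2num(df.index.to_pydatetime())
width = 10  # bar width in days (arbitrary choice)

# Grouped bars: shift each series by half a bar width around the date
ax.bar(x - width / 2, df["A"], width=width, label="A")
ax.bar(x + width / 2, df["B"], width=width, label="B")

# Lines drawn on the same Axes, in the same units
ax.plot(x, df["C"], color="tab:green", label="C")
ax.plot(x, df["D"], color="tab:red", label="D")

# Interpret the x values as dates and label one tick per month
ax.xaxis_date()
ax.xaxis.set_major_locator(mdates.MonthLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter("%b"))
ax.legend()
```

Because everything is placed in date units, there is no need for use_index=False or manual tick labels, at the cost of having to choose the bar width and offsets by hand.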

Upvotes: 4

xpt

Reputation: 22994

I wanted to know this as well. However, none of the existing answers shows a bar and line graph on the same plot; they use different axes instead.

So I looked for the answer myself and found a working example: Plot Pandas DataFrame as Bar and Line on the same one chart. I can confirm that it works.

What baffled me was that almost the same code works there but does not work here. That is, I copied the OP's code and can verify that it does not work as expected.

The only thing I could think of is to add the index column to Df[['A','B']] and Df[['C','D']], but I don't know how, since the index column doesn't have a name for me to add.

Today I realized that even if I can make it work, the real problem is that Df[['A','B']] gives a grouped (clustered) bar chart, whereas a grouped (clustered) line chart is not supported.

Upvotes: 2

Michal

Reputation: 2007

You can also try this:

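fig, ax = plt.subplots()
# Select several columns with a list (double brackets); the DataFrame in the question is named Df
Df[['A','B']].plot(kind="bar", ax=ax)
plt.xticks(rotation=0)
# Second y-axis sharing the same x-axis as the bars
ax2 = ax.twinx()
# Draw the lines at the bar tick positions (0, 1, 2, ...)
ax2.plot(ax.get_xticks(), Df[['C','D']].to_numpy(), marker='o')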

Upvotes: 15

user1301404

Reputation:

You can do something like this, with both plots on the same figure but in separate subplots:

In [4]: Df = pd.DataFrame(data=np.random.randn(10,4), index=pd.DatetimeIndex(start='2005', freq='M', periods=10), columns=['A','B','C','D'])

In [5]: fig, ax = plt.subplots(2, 1) # you can pass sharex=True, sharey=True if you want to share axes.

In [6]: Df[['A','B']].plot(kind='bar', ax=ax[0])
Out[6]: <matplotlib.axes.AxesSubplot at 0x10cf011d0>

In [7]: Df[['C','D']].plot(color=['r', 'c'], ax=ax[1])
Out[7]: <matplotlib.axes.AxesSubplot at 0x10a656ed0>

Upvotes: 0
