nostalghia

Reputation: 15

Matplotlib: Boxplot and bar chart shifted when overlaid using twinx

When I create a boxplot and overlay a bar chart using twinx, the boxes appear shifted one position to the right of the bars.

This problem has been reported before (Python pandas plotting shift x-axis if twinx two y-axes), but the solution given there no longer seems to work. I am using Matplotlib 3.1.0.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

li_str = ['one', 'two', 'three', 'four', 'five', 'six', 'seven', 'eight', 'nine', 'ten']

df = pd.DataFrame([[i]+j[k] for i,j in {li_str[i]:np.random.randn(j,2).tolist() for i,j in \
    enumerate(np.random.randint(5, 15, len(li_str)))}.items() for k in range(len(j))]
    , columns=['A', 'B', 'C'])

fig, ax = plt.subplots(figsize=(16,6))
ax2 = ax.twinx()
df_gb = df.groupby('A').count()
p1 = df.boxplot(ax=ax, column='B', by='A', sym='')
p2 = df_gb['B'].plot(ax=ax2, kind='bar', figsize=(16,6)
    , colormap='Set2', alpha=0.3, secondary_y=True)
plt.ylim([0, 20])

The problematic chart

The output shows the boxes shifted one position to the right of the bars. The answerer of the previous post rightly pointed out that the tick locations of the bars are zero-based while those of the boxes are one-based, which causes the shift. However, the plt.bar() call that answer uses as a fix now raises an error, because the x parameter has become mandatory. Supplying x as well still raises an error, because the 'left' parameter no longer exists.
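The mismatch can be checked by printing the tick locations each plot type sets on its own axes. This is only a minimal sketch on hypothetical toy data (the frame `toy` and the two side-by-side axes are assumptions for illustration, not part of the code above):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
# Hypothetical toy data: three groups with three values each
toy = pd.DataFrame({"A": list("aaabbbccc"), "B": rng.normal(size=9)})

# Separate axes, so each plot type sets its own tick locations
fig, (ax_box, ax_bar) = plt.subplots(1, 2)
toy.boxplot(column="B", by="A", ax=ax_box)
toy.groupby("A")["B"].count().plot(kind="bar", ax=ax_bar)

box_ticks = [float(t) for t in ax_box.get_xticks()]
bar_ticks = [float(t) for t in ax_bar.get_xticks()]
print("boxplot ticks:", box_ticks)
print("bar ticks:    ", bar_ticks)
```

With default settings the boxplot ticks should start at 1 and the bar ticks at 0, which matches the one-position shift seen in the chart.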

df.boxplot(column='B', by='A')
plt.twinx()
plt.bar(left=plt.xticks()[0], height=df.groupby('A').count()['B'],
  align='center', alpha=0.3)

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-186-e257461650c1> in <module>
     26 plt.twinx()
     27 plt.bar(left=plt.xticks()[0], height=df.groupby('A').count()['B'],
---> 28         align='center', alpha=0.3)

TypeError: bar() missing 1 required positional argument: 'x'

In addition, I would much prefer a fix using the object-oriented approach with reference to the axes, because I want to place the chart into an interactive ipywidget.

Here is the ideal chart:

Ideal chart

Many thanks.

Upvotes: 1

Views: 3496

Answers (2)

Deepanker Singh

Reputation: 1

From matplotlib.pyplot.boxplot():

positions : array-like, optional

The positions of the boxes. The ticks and limits are automatically set to match the positions. Defaults to range(1, N+1) where N is the number of boxes to be drawn.

The default box positions run from 1 to len(df_gb['B']), so passing zero-based positions to match the bars,

df.boxplot(column='B', by='A', ax=ax, positions=list(range(len(df_gb['B']))))

would also do the trick.
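For reference, here is a minimal end-to-end sketch of this idea. It assumes a small hypothetical frame (three groups with made-up sizes) and draws the bars with a plain `ax2.bar` call at the same zero-based positions, so it shows one way to apply the `positions` argument rather than the only way:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
# Hypothetical data: group labels repeated 5, 8 and 6 times
sizes = [5, 8, 6]
df = pd.DataFrame({
    "A": np.repeat(["one", "two", "three"], sizes),
    "B": rng.normal(size=sum(sizes)),
})
df_gb = df.groupby("A").count()

# Zero-based positions shared by the boxes and the bars
positions = list(range(len(df_gb["B"])))

fig, ax = plt.subplots(figsize=(8, 4))
ax2 = ax.twinx()
df.boxplot(column="B", by="A", ax=ax, positions=positions)
bars = ax2.bar(positions, df_gb["B"], align="center", alpha=0.3)

# Compare the bar centers with the tick locations set by the boxplot
bar_centers = [bar.get_x() + bar.get_width() / 2 for bar in bars]
box_ticks = [float(t) for t in ax.get_xticks()]
print(bar_centers, box_ticks)
```

Because both calls receive the same `positions` list, the bars and boxes no longer depend on each plot type's default indexing.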

Boxplot grouped by A

Upvotes: 0

Sheldore

Reputation: 39072

You can use the following trick: Provide the x-values for placing your bars starting at x=1. To do so, use range(1, len(df_gb['B'])+1) as the x-values.

fig, ax = plt.subplots(figsize=(8, 4))
ax2 = ax.twinx()
df_gb = df.groupby('A').count()
df.boxplot(column='B', by='A', ax=ax)
ax2.bar(range(1, len(df_gb['B']) + 1), height=df_gb['B'], align='center', alpha=0.3)
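Since the question asks for an axes-based solution that can be dropped into a widget, the same idea can be wrapped in a function. The helper name `boxplot_with_counts` and the demo data are hypothetical. Instead of hard-coding `range(1, N+1)`, this sketch reads the tick locations back from the boxplot axes and passes them as `x`, which is roughly what the older `left=plt.xticks()[0]` call did:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def boxplot_with_counts(df, ax, column, by):
    """Draw grouped boxes on `ax` and the group counts as bars on a twin axis.

    Hypothetical helper for illustration, not part of pandas or Matplotlib.
    """
    ax2 = ax.twinx()
    df.boxplot(column=column, by=by, ax=ax)
    counts = df.groupby(by)[column].count()
    # Reuse the tick locations the boxplot just set (1..N by default)
    bars = ax2.bar(x=ax.get_xticks(), height=counts.to_numpy(),
                   align="center", alpha=0.3)
    return ax2, bars


# Usage on made-up data: three groups of sizes 5, 8 and 6
rng = np.random.default_rng(2)
demo = pd.DataFrame({
    "A": np.repeat(["one", "two", "three"], [5, 8, 6]),
    "B": rng.normal(size=19),
})
fig, ax = plt.subplots(figsize=(8, 4))
ax2, bars = boxplot_with_counts(demo, ax, column="B", by="A")
centers = [bar.get_x() + bar.get_width() / 2 for bar in bars]
print(centers)
```

Both `groupby` and the grouped boxplot order the groups by sorted key, so the counts should line up with the boxes. Returning `ax2` lets the caller set limits or labels on the secondary axis afterwards.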


Upvotes: 0
