Reputation: 2304
I have the following dataset:
df_plots = pd.DataFrame({'Group':['A','A','A','A','A','A','B','B','B','B','B','B'],
'Type':['X','X','X','Y','Y','Y','X','X','X','Y','Y','Y'],
'Value':[1,1.2,1.4,1.3,1.8,1.5,15,19,18,17,12,13]})
df_plots
   Group Type  Value
0      A    X    1.0
1      A    X    1.2
2      A    X    1.4
3      A    Y    1.3
4      A    Y    1.8
5      A    Y    1.5
6      B    X   15.0
7      B    X   19.0
8      B    X   18.0
9      B    Y   17.0
10     B    Y   12.0
11     B    Y   13.0
I want to create one boxplot figure per Group (there are two in the example), and within each plot show one box per Type. I have tried this:
fig, axs = plt.subplots(1,2,figsize=(8,6), sharey=False)
axs = axs.flatten()
for i, g in enumerate(df_plots[['Group','Type','Value']].groupby(['Group','Type'])):
    g[1].boxplot(ax=axs[i])
But I get an IndexError, because the loop tries to create 4 plots:
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
<ipython-input-12-8e1150950024> in <module>
3
4 for i, g in enumerate(df[['Group','Type','Value']].groupby(['Group','Type'])):
----> 5 g[1].boxplot(ax=axs[i])
IndexError: index 2 is out of bounds for axis 0 with size 2
Then I tried this:
fig, axs = plt.subplots(1,2,figsize=(8,6), sharey=False)
axs = axs.flatten()
for i, g in enumerate(df_plots[['Group','Type','Value']].groupby(['Group','Type'])):
    g[1].boxplot(ax=axs[i], by=['Group','Type'])
But no, I have the same problem. The expected result should have only two plots, and each plot should have one box-and-whisker per Type. This is a sketch of the idea:
Any help will be greatly appreciated. With this code I can control some aspects of the data that I can't with seaborn.
Upvotes: 5
Views: 7744
Reputation: 41327
As @Prune mentioned, the immediate issue is that your groupby() returns four groups (AX, AY, BX, BY), so first fix the indexing and then clean up a couple more issues:
1. Change axs[i] to axs[i//2] to put groups 0 and 1 on axs[0] and groups 2 and 3 on axs[1].
2. Add positions=[i] to place the boxplots side by side rather than stacked.
3. Set the title and xticklabels after plotting (I'm not aware of how to do this in the main loop).
for i, g in enumerate(df_plots.groupby(['Group', 'Type'])):
    g[1].boxplot(ax=axs[i//2], positions=[i])
for i, ax in enumerate(axs):
    ax.set_title('Group: ' + df_plots['Group'].unique()[i])
    ax.set_xticklabels(['Type: X', 'Type: Y'])
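The index arithmetic can be checked without drawing anything. The following is a minimal sketch, assuming the same df_plots as in the question; the name axis_for_group is made up for illustration and is not part of the answer's code.

```python
import pandas as pd

df_plots = pd.DataFrame({
    'Group': ['A', 'A', 'A', 'A', 'A', 'A', 'B', 'B', 'B', 'B', 'B', 'B'],
    'Type': ['X', 'X', 'X', 'Y', 'Y', 'Y', 'X', 'X', 'X', 'Y', 'Y', 'Y'],
    'Value': [1, 1.2, 1.4, 1.3, 1.8, 1.5, 15, 19, 18, 17, 12, 13],
})

# groupby on two columns yields one (key, frame) pair per Group/Type combination,
# sorted by key: ('A','X'), ('A','Y'), ('B','X'), ('B','Y').
# Integer division by 2 maps those four positions onto the two axes.
axis_for_group = {
    key: i // 2
    for i, (key, _) in enumerate(df_plots.groupby(['Group', 'Type']))
}

for key, axis_index in axis_for_group.items():
    print(key, '-> axs[%d]' % axis_index)
```

Both A groups land on axs[0] and both B groups on axs[1]. The i//2 trick relies on every Group having exactly two Types; with uneven groups the mapping would need to be derived from the Group key instead.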
Note that mileage may vary depending on version:
| | matplotlib.__version__ | pd.__version__ |
|---|---|---|
| confirmed working | 3.4.2 | 1.3.1 |
| confirmed not working | 3.0.1 | 1.2.4 |
Upvotes: 3
Reputation: 35646
We can use groupby boxplot to create subplots per Group and then separate each boxplot by Type:
fig, axes = plt.subplots(1, 2, figsize=(8, 6), sharey=False)
df_plots.groupby('Group').boxplot(by='Type', ax=axes)
plt.show()
Or without subplots by passing parameters directly through the function call:
axes = df_plots.groupby('Group').boxplot(by='Type', figsize=(8, 6),
layout=(1, 2), sharey=False)
plt.show()
Data and imports:
import pandas as pd
from matplotlib import pyplot as plt
df_plots = pd.DataFrame({
'Group': ['A', 'A', 'A', 'A', 'A', 'A', 'B', 'B', 'B', 'B', 'B', 'B'],
'Type': ['X', 'X', 'X', 'Y', 'Y', 'Y', 'X', 'X', 'X', 'Y', 'Y', 'Y'],
'Value': [1, 1.2, 1.4, 1.3, 1.8, 1.5, 15, 19, 18, 17, 12, 13]
})
Upvotes: 7
Reputation: 260975
Use seaborn.catplot:
import seaborn as sns
sns.catplot(data=df_plots, kind='box', col='Group', x='Type', y='Value', hue='Type', sharey=False, height=4)
Upvotes: 4
Reputation: 77857
The immediate problem is that your groupby operation returns four elements (AX, AY, BX, BY), which you're trying to plot individually. You try to use ax=axs[i], but i runs 0-3, while you have only two elements in your flattened structure. There is no axs[2] or axs[3], which raises the given run-time exception.
You need to resolve your referencing one way or the other.
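One possible way to resolve the referencing, offered as a sketch rather than the only fix: loop over Group alone so there is exactly one iteration per axes, and split by Type inside the loop. This version calls matplotlib's Axes.boxplot directly instead of the pandas wrapper, which is a deliberate swap to keep the behaviour less dependent on the pandas version. The Agg backend line is an assumption for headless runs and can be dropped in a notebook.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; remove in a notebook
import matplotlib.pyplot as plt
import pandas as pd

df_plots = pd.DataFrame({
    'Group': ['A', 'A', 'A', 'A', 'A', 'A', 'B', 'B', 'B', 'B', 'B', 'B'],
    'Type': ['X', 'X', 'X', 'Y', 'Y', 'Y', 'X', 'X', 'X', 'Y', 'Y', 'Y'],
    'Value': [1, 1.2, 1.4, 1.3, 1.8, 1.5, 15, 19, 18, 17, 12, 13],
})

groups = df_plots.groupby('Group')

# One axes per Group, so the loop index can never run past the axes array.
fig, axs = plt.subplots(1, groups.ngroups, figsize=(8, 6),
                        sharey=False, squeeze=False)
axs = axs.flatten()

for ax, (group_name, group_df) in zip(axs, groups):
    # Collect one array of values per Type within this Group.
    by_type = group_df.groupby('Type')['Value']
    type_names = [name for name, _ in by_type]
    values = [vals.to_numpy() for _, vals in by_type]

    # One box per Type, drawn at positions 1..n on this axes.
    ax.boxplot(values)
    ax.set_xticklabels(['Type: ' + str(t) for t in type_names])
    ax.set_title('Group: ' + str(group_name))

fig.tight_layout()
```

Because each axes is handled in a single iteration, the title and tick labels can be set inside the main loop, and any other per-axes styling can go in the same place.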
Upvotes: 2