Reputation: 47
I have a DataFrame with a column 'category' and two columns 'T1' and 'T2'. So far I have plotted one boxplot of 'T1' by 'category' and a second boxplot of 'T2' by 'category'. 'category' contains 9 different values, and the dataset has about n=350 rows.
If I create a 'normal' boxplot I get a plot with 9 boxes in it, but I end up with 2 separate plots, one for T1 and one for T2. I want to display 2 boxplots for each category, T1 and T2, in a single plot. I have no idea how to start. I already read about grouped boxplots, but I don't think that is the right approach.
I created an example.
import pandas as pd
import seaborn as sns
data = {'Category': ['eins','zwei','drei', 'vier', 'fünf', 'sechs', 'sieben', 'acht', 'neun', 'eins','zwei','drei', 'vier', 'fünf',
'sechs', 'sieben', 'acht', 'neun'],
'T1': ['1', '6', '5','8', '4', '7', '5', '7', '1', '7', '3', '2', '1', '4', '7', '5', '7', '1'],
'T2':['3', '7', '7','9', '8', '10', '8', '9', '3', '10', '9', '5', '3', '8', '9', '6', '7', '5']}
df = pd.DataFrame(data)
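# Assign the whole column so the dtype really changes to int
# (in recent pandas, df.loc[:, col] = ... writes in place and keeps the object dtype)
df['T1'] = df['T1'].astype(int)
df['T2'] = df['T2'].astype(int)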
sns.boxplot(x=df['T1'], y=df['Category'])
sns.boxplot(x=df['T2'], y=df['Category'])
I tried also this:
import matplotlib.pyplot as plt
f, ax = plt.subplots()
# Both calls draw on the same axes, so the T2 boxes are painted over the T1 boxes
sns.boxplot(x="T1", y="Category", data=df, palette="Set1", ax=ax)
sns.boxplot(x="T2", y="Category", data=df, palette="Set3", ax=ax)
plt.show()
Then I get 2 plots in one graph, but they overlay each other. How can I display the boxplot of T2 below the boxplot of T1 for the respective category?
Upvotes: 2
Views: 53
Reputation: 33147
I would use the hue parameter of the sns.boxplot function:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
data = {'Category': ['eins','zwei','drei', 'vier', 'fünf', 'sechs', 'sieben', 'acht', 'neun', 'eins','zwei','drei', 'vier', 'fünf',
'sechs', 'sieben', 'acht', 'neun'],
'T1': ['1', '6', '5','8', '4', '7', '5', '7', '1', '7', '3', '2', '1', '4', '7', '5', '7', '1'],
'T2':['3', '7', '7','9', '8', '10', '8', '9', '3', '10', '9', '5', '3', '8', '9', '6', '7', '5']}
df = pd.DataFrame(data)
# Assign the whole column so the dtype really changes to int
df['T1'] = df['T1'].astype(int)
df['T2'] = df['T2'].astype(int)
# Reshape to long format: one row per (Category, variable, value)
long_df = df.melt(id_vars='Category', var_name='variable', value_name='value')
# hue='variable' draws a T1 box and a T2 box next to each other for every category
sns.boxplot(x='Category', y='value', hue='variable', data=long_df)
plt.show()
Here, I used df.melt to convert the 'T1' and 'T2' columns into a single 'value' column, plus a 'variable' column indicating which of the two columns each value came from.
You can print df.melt(id_vars='Category', var_name='variable', value_name='value') to see the reshaped output.
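To make the reshaping concrete, here is a minimal sketch of what melt does on a tiny frame. The four-row frame `wide` and the name `long_df` are made up for illustration; only the melt call itself is taken from the answer.

```python
import pandas as pd

# Hypothetical stand-in for the question's data: two categories, two rows each
wide = pd.DataFrame({
    'Category': ['eins', 'zwei', 'eins', 'zwei'],
    'T1': [1, 6, 7, 3],
    'T2': [3, 7, 10, 9],
})

# Every column that is not listed in id_vars is stacked into 'value',
# and its former column name is recorded in 'variable'
long_df = wide.melt(id_vars='Category', var_name='variable', value_name='value')

# The result has 8 rows: the 4 T1 values first, then the 4 T2 values
print(long_df)
```

Each original row now appears twice, once per measurement, which is the shape the hue parameter needs.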
Or, to keep the boxes horizontal as in the question, swap x and y:
sns.boxplot(y='Category', x='value', hue='variable', data=df.melt(id_vars='Category', var_name='variable', value_name='value'))
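If you need this more than once, the horizontal variant could be wrapped in a small helper. This is only a sketch: the function name `paired_boxplot`, its parameters, and the figure size are my own choices, not part of seaborn. Passing hue_order pins the order of the two boxes within each category.

```python
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt


def paired_boxplot(df, id_col='Category', value_cols=('T1', 'T2')):
    """Hypothetical helper: one horizontal box per value column for each category."""
    # Long format: one row per (category, variable, value)
    long_df = df.melt(id_vars=id_col, value_vars=list(value_cols),
                      var_name='variable', value_name='value')
    fig, ax = plt.subplots(figsize=(6, 8))  # arbitrary size, adjust as needed
    # Numeric x and categorical y give horizontal boxes;
    # hue_order fixes the order of the boxes within a category
    sns.boxplot(data=long_df, x='value', y=id_col, hue='variable',
                hue_order=list(value_cols), ax=ax)
    return ax


# Example data from the question, with the values already stored as integers
df = pd.DataFrame({
    'Category': ['eins', 'zwei', 'drei', 'vier', 'fünf', 'sechs', 'sieben',
                 'acht', 'neun'] * 2,
    'T1': [1, 6, 5, 8, 4, 7, 5, 7, 1, 7, 3, 2, 1, 4, 7, 5, 7, 1],
    'T2': [3, 7, 7, 9, 8, 10, 8, 9, 3, 10, 9, 5, 3, 8, 9, 6, 7, 5],
})

ax = paired_boxplot(df)
```

Call plt.show() afterwards (or ax.figure.savefig(...)) to display or save the figure.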
Upvotes: 0