Reputation: 835
I have a pandas dataframe that looks like this:
class men woman children
0 first 0.91468 0.667971 0.660562
1 second 0.30012 0.329380 0.882608
2 third 0.11899 0.189747 0.121259
How would I create a plot using seaborn that looks like this? Do I have to rearrange my data in some way?
(source: mwaskom at stanford.edu)
Upvotes: 50
Views: 114485
Reputation:
Tested in python 3.12.0, pandas 2.1.1, matplotlib 3.8.0, seaborn 0.13.0
Reshape the DataFrame with pandas.DataFrame.melt or pandas.melt:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
# convert the dataframe to a long format
dfm = pd.melt(df, id_vars="class", var_name="sex", value_name="survival rate")
dfm
Out:
class sex survival rate
0 first men 0.914680
1 second men 0.300120
2 third men 0.118990
3 first woman 0.667971
4 second woman 0.329380
5 third woman 0.189747
6 first children 0.660562
7 second children 0.882608
8 third children 0.121259
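For a self-contained run, the wide-form frame from the question can be rebuilt by hand. The sketch below does that and checks that the function form and the method form of melt give the same long-form result. The name dfm_method is only illustrative and does not appear elsewhere in this answer.

```python
import pandas as pd

# Wide-form frame, values copied from the question
df = pd.DataFrame({
    "class": ["first", "second", "third"],
    "men": [0.91468, 0.30012, 0.11899],
    "woman": [0.667971, 0.329380, 0.189747],
    "children": [0.660562, 0.882608, 0.121259],
})

# Function form: one row per (class, sex) pair
dfm = pd.melt(df, id_vars="class", var_name="sex", value_name="survival rate")

# Method form (dfm_method is a hypothetical name used only for this comparison)
dfm_method = df.melt(id_vars="class", var_name="sex", value_name="survival rate")

print(dfm.shape)               # 3 classes x 3 groups = 9 rows, 3 columns
print(dfm.equals(dfm_method))  # both forms should agree
```

Either form works; the method form chains more naturally after other DataFrame operations.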
Consolidate the plot by creating a single facet with grouped bars, instead of multiple facets with single bars.
Plot with the figure-level method sns.catplot:
g = sns.catplot(x='class', y='survival rate', hue='sex', data=dfm, kind='bar', height=5, aspect=1)
Plot with the axes-level method sns.barplot:
# the following code matches the plot produced by catplot
plt.figure(figsize=(5, 5))
ax = sns.barplot(x='class', y='survival rate', hue='sex', data=dfm)
ax.spines[['top', 'right']].set_visible(False)
sns.move_legend(ax, bbox_to_anchor=(1, 0.5), loc='center left', frameon=False)
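The axes-level approach can also be run end to end as a rough sketch, with the data rebuilt from the question and an explicit Axes passed to barplot. The Agg backend is an assumption for running without a display, and legend_labels is an illustrative name, not part of the original answer.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Wide-form frame from the question, melted to long form
df = pd.DataFrame({
    "class": ["first", "second", "third"],
    "men": [0.91468, 0.30012, 0.11899],
    "woman": [0.667971, 0.329380, 0.189747],
    "children": [0.660562, 0.882608, 0.121259],
})
dfm = df.melt(id_vars="class", var_name="sex", value_name="survival rate")

# One Axes, bars grouped by class and colored by sex
fig, ax = plt.subplots(figsize=(5, 5))
sns.barplot(data=dfm, x="class", y="survival rate", hue="sex", ax=ax)

# Read back the legend entries to confirm one hue level per original column
legend_labels = [t.get_text() for t in ax.get_legend().get_texts()]
print(legend_labels)
```

Passing ax explicitly keeps the figure handle available for later calls such as fig.savefig.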
For seaborn v0.8.1 or earlier, use factorplot (renamed to catplot in v0.9.0), passing the long-form dfm:
sns.factorplot(x='class', y='survival rate', hue='sex', data=dfm, kind='bar')
Upvotes: 89
Reputation: 198
To produce the plot in the OP, I used the following code, after converting the dataframe from wide-form to long-form.
Tested in python 3.12.0, pandas 2.1.1, matplotlib 3.8.0, seaborn 0.13.0
Data:
import pandas as pd
import seaborn as sns
d = {'class': ['first', 'second', 'third', 'first', 'second', 'third', 'first', 'second', 'third'], 'sex': ['men', 'men', 'men', 'woman', 'woman', 'woman', 'children', 'children', 'children'], 'survival_rate': [0.914680, 0.300120, 0.118990, 0.667971, 0.329380, 0.189747, 0.660562, 0.882608, 0.121259]}
df = pd.DataFrame(data=d)
g = sns.catplot(kind='bar', data=df, x='sex', y='survival_rate', col='class')
# seaborn v0.8.1 or earlier only (factorplot was renamed to catplot in v0.9.0)
sns.factorplot("sex", "survival_rate", col="class", data=df, kind="bar")
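As a quick check of what col='class' does, the sketch below rebuilds the same long-form data and inspects the returned FacetGrid. The Agg backend is an assumption for running without a display, and titles is an illustrative name.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line in a notebook
import pandas as pd
import seaborn as sns

# Long-form data, same values as above
df = pd.DataFrame({
    "class": ["first", "second", "third"] * 3,
    "sex": ["men"] * 3 + ["woman"] * 3 + ["children"] * 3,
    "survival_rate": [0.914680, 0.300120, 0.118990,
                      0.667971, 0.329380, 0.189747,
                      0.660562, 0.882608, 0.121259],
})

# col='class' draws one facet (subplot) per class, each with one bar per sex
g = sns.catplot(kind="bar", data=df, x="sex", y="survival_rate", col="class")

# g.axes is a 2D array of Axes: one row, one column per class
titles = [ax.get_title() for ax in g.axes.flat]
print(g.axes.shape)
print(titles)
```

Swapping col='class' for hue='class' would put all bars on a single facet instead.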
Upvotes: 14