Reputation: 2724
I am working with the tips dataset; here is its head:
   total_bill   tip     sex smoker  day    time  size
0       16.99  1.01  Female     No  Sun  Dinner     2
1       10.34  1.66    Male     No  Sun  Dinner     3
2       21.01  3.50    Male     No  Sun  Dinner     3
3       23.68  3.31    Male     No  Sun  Dinner     2
4       24.59  3.61  Female     No  Sun  Dinner     4
My code is
sns.violinplot(x='day',y='total_bill',data=tips, hue=['sex','smoker'])
I want a violin plot of total_bill by day in which the hue combines sex and smoker, but I cannot find any option to pass multiple columns to hue. Is there any way?
Upvotes: 20
Views: 21248
Reputation: 62553
Combine 'day'/'sex' or 'day'/'smoker' into a new column, set that as x=, use 'smoker' or 'sex', respectively, as hue=, and set split=True.
Versions: python 3.10, pandas 1.4.2, matplotlib 3.5.1, seaborn 0.11.2
import seaborn as sns
import matplotlib.pyplot as plt
import pandas as pd
# load sample data
tips = sns.load_dataset("tips")
# create a new column
tips['Day - Sex'] = tips.day.astype(str) + ' - ' + tips.sex.astype(str)
# set to categorical to specify an order
categories = ['Thur - Female', 'Thur - Male', 'Fri - Female', 'Fri - Male', 'Sat - Female', 'Sat - Male', 'Sun - Female', 'Sun - Male']
tips['Day - Sex'] = pd.Categorical(tips['Day - Sex'], categories=categories, ordered=True)
# plot
fig, ax = plt.subplots(figsize=(12, 6))
sns.violinplot(x='Day - Sex', y='total_bill', data=tips, hue='smoker', ax=ax, split=True)
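If typing the category list by hand feels error-prone, it can probably be generated instead. This is only a sketch: the names day_order and sex_order are my own, and the orderings are simply copied from the hand-written list above.

```python
from itertools import product

# Assumed orderings, copied from the hand-written list in the answer above
day_order = ["Thur", "Fri", "Sat", "Sun"]
sex_order = ["Female", "Male"]

# One "day - sex" label per combination; product varies the last factor fastest,
# so the labels come out grouped by day
categories = [f"{day} - {sex}" for day, sex in product(day_order, sex_order)]
print(categories)
```

The resulting list can then be passed to pd.Categorical(..., categories=categories, ordered=True) exactly as in the code above.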
Upvotes: 2
Reputation: 1793
The faceting approach suggested by the accepted answer is probably nicer in this case, but might not be easily applicable to other kinds of Seaborn plots (e.g. in my case, ecdfplot). So I just wanted to share that I figured out a solution which does what OP originally asked, i.e. actually use multiple columns for the hue parameter.
The trick is that hue can either be a column name, or a sequence of the same length as your data, listing the color categories to assign each data point to. So...
sns.violinplot(x='day', y='total_bill', data=tips, hue='sex')
... is basically the same as:
sns.violinplot(x='day', y='total_bill', data=tips, hue=tips['sex'])
You typically wouldn't use the latter, since it's just more typing to achieve the same thing, unless you want to construct a custom sequence on the fly:
sns.violinplot(x='day', y='total_bill', data=tips,
               hue=tips[['sex', 'smoker']].apply(tuple, axis=1))
The way you build the sequence that you pass via hue is entirely up to you. The only requirements are that it has the same length as your data and, if it is array-like, that it is one-dimensional. So you can't just pass hue=tips[['sex', 'smoker']]; you have to somehow concatenate the columns into one. I chose tuple as the most versatile way, but if you want more control over the formatting, build a Series of strings (saved in a separate variable here for readability, but you don't have to):
hue = tips['sex'].astype(str) + ', ' + tips['smoker'].astype(str)
sns.violinplot(x='day', y='total_bill', data=tips, hue=hue)
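For more than two columns, the string concatenation can be wrapped in a small helper. The following is a minimal sketch, not part of seaborn or pandas: combine_columns is a hypothetical name of mine, and the DataFrame is just the five rows from the question typed in by hand so the example runs without downloading anything.

```python
import pandas as pd

# First five rows of the tips data from the question, typed in by hand
tips = pd.DataFrame({
    "total_bill": [16.99, 10.34, 21.01, 23.68, 24.59],
    "sex": ["Female", "Male", "Male", "Male", "Female"],
    "smoker": ["No", "No", "No", "No", "No"],
    "day": ["Sun", "Sun", "Sun", "Sun", "Sun"],
})


def combine_columns(df, columns, sep=", "):
    """Hypothetical helper: join several columns into one 1-D Series of labels."""
    # Cast to str first so categorical or numeric columns can be joined too
    return df[columns].astype(str).apply(sep.join, axis=1)


hue = combine_columns(tips, ["sex", "smoker"])
print(hue.tolist())
# ['Female, No', 'Male, No', 'Male, No', 'Male, No', 'Female, No']
```

The result is one-dimensional and has one entry per row, so it should be usable as hue=hue in the same way as the Series built above.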
Upvotes: 29
Reputation: 12524
You could use seaborn.catplot with 'sex' as hue and 'smoker' as the column variable (col) to generate two side-by-side violin plots.
Check this code:
import seaborn as sns
import matplotlib.pyplot as plt
sns.set()
tips = sns.load_dataset("tips")
sns.catplot(x="day",
            y="total_bill",
            hue="sex",
            col="smoker",
            data=tips,
            kind="violin",
            split=True)
plt.show()
This gives a figure with one panel per smoker value; within each panel the violins are split by sex.
Upvotes: 7