Seaborn: Hue dependent on two values

Question

I have the following two dataframes that I would like to plot together. The first one (data) contains the complete data of different groups for several repeated experiments (= replicates), with the values for the individual cells within each experiment. The second one (avgs) holds the mean of each replicate experiment for every group. I basically want to plot my data in the way suggested here.

data.head()

   cell   replicate     value           group   
0   1         1         0.029723        GROUP_A 
1   1         2         0.019136        GROUP_A     
2   2         2         0.020216        GROUP_A 
3   3         1         0.032020        GROUP_B
4   3         2         0.044815        GROUP_B

avgs.head()

          replicate     value           group   
0         1             0.019709        GROUP_A 
1         2             0.018937        GROUP_A     
2         1             0.358437        GROUP_B 
3         2             0.269602        GROUP_B
4         3             0.303252        GROUP_B

My aim is to achieve either the plots shown in B or C, where the hue depends on both the group and replicate.

import matplotlib.pyplot as plt
import seaborn as sns
sns.swarmplot(x="group", y="value", data=data, hue="replicate")
sns.swarmplot(x="group", y="value", data=avgs, size=8, hue="replicate", edgecolor="k", linewidth=2)

This gives me essentially the plot shown in A, with the hue corresponding to the replicate.

Is there a way to do this with a different color palette for each group, so that each group has its own color and each replicate a different shade of that color (example B, made in Affinity Designer)?

An alternative that would work for me is to plot the single-cell values of data with a grey palette. But when I then add the replicate means from avgs, how can I give each group a different color and each replicate mean the corresponding shade of that color (example C)?

Is it possible to pass a palette dictionary to seaborn/matplotlib, e.g. something like:

gray = sns.dark_palette("gray", n_colors=5)
red = sns.dark_palette("red", n_colors=5)
blue = sns.dark_palette("blue", n_colors=5)
 
my_palette={"GROUP_A": gray, "GROUP_B": red, "GROUP_C": blue}

Thanks!

JohanC · Accepted Answer

The groups can be plotted separately, each with its own palette. To make sure the x-positions are respected, the order= keyword needs to be set with all the desired x-labels.

Seaborn automatically adds legend entries for each call, so the legend can get very large. You can either suppress the legend, or limit it to the first few entries.

from matplotlib import pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic test data: N cells spread over 3 replicates and 4 groups
N = 500
data = pd.DataFrame({'replicate': np.random.choice(range(1, 4), N),
                     'value': 2 + np.random.uniform(-0.5, 0.5, (N, 5)).sum(axis=1),
                     'group': np.random.choice([f'GROUP_{g}' for g in 'ABCD'], N)})
groups = np.unique(data.group)
# Shift each group by a random offset so the groups differ visibly
for g in groups:
    data.loc[data.group == g, 'value'] += np.random.uniform(0, 3)
# Mean value for every (replicate, group) combination
avgs = data.groupby(['replicate', 'group']).mean()
avgs.reset_index(inplace=True)

my_palette = {"GROUP_A": 'Greys', "GROUP_B": 'Reds', "GROUP_C": 'Blues', "GROUP_D": 'Greens'}

# Two swarmplot calls per group (cells, then replicate means), both using
# that group's palette; order=groups keeps each group at its own x-position
for g in groups:
    sns.swarmplot(x="group", y="value", data=data[data.group == g], order=groups,
                  palette=my_palette[g], hue="replicate")
    sns.swarmplot(x="group", y="value", data=avgs[avgs.group == g], order=groups,
                  size=8, palette=my_palette[g], hue="replicate", edgecolor="k", linewidth=2)

# plt.gca().legend_.remove() # optionally suppress the legend
handles, labels = plt.gca().get_legend_handles_labels()
plt.legend(handles=handles[:3], title='replicate')
plt.tight_layout()
plt.show()
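
If you would rather pass a single palette dictionary, as sketched in the question, one possible variation is to build a combined hue column and map every group/replicate combination to its own shade. This is only a sketch, not part of the approach above: the helper names (build_palette, add_hue_key, plot_cells_and_means), the hue_key column and the "|" separator are made up for illustration, and it assumes every group uses the same replicate numbers.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns


def build_palette(groups, replicates, cmap_names):
    """Map each 'group|replicate' key to one shade of that group's colormap."""
    palette = {}
    for g in groups:
        # One shade per replicate, sampled from the colormap chosen for this group
        shades = sns.color_palette(cmap_names[g], n_colors=len(replicates))
        for r, shade in zip(replicates, shades):
            palette[f"{g}|{r}"] = shade
    return palette


def add_hue_key(df):
    """Return a copy of df with a combined 'group|replicate' column (hypothetical name)."""
    return df.assign(hue_key=df["group"] + "|" + df["replicate"].astype(str))


def plot_cells_and_means(data, avgs, palette, order):
    """Draw the single cells and the replicate means in one call each."""
    ax = sns.swarmplot(x="group", y="value", data=data, order=order,
                       hue="hue_key", palette=palette, size=3)
    sns.swarmplot(x="group", y="value", data=avgs, order=order,
                  hue="hue_key", palette=palette, size=8,
                  edgecolor="k", linewidth=2, ax=ax)
    # The combined keys give one legend entry per combination, so drop the legend
    if ax.get_legend() is not None:
        ax.get_legend().remove()
    return ax


if __name__ == "__main__":
    # Synthetic data in the same spirit as the example above
    rng = np.random.default_rng(0)
    N = 500
    group_names = [f"GROUP_{g}" for g in "ABCD"]
    data = pd.DataFrame({"replicate": rng.integers(1, 4, N),
                         "value": 2 + rng.uniform(-0.5, 0.5, (N, 5)).sum(axis=1),
                         "group": rng.choice(group_names, N)})
    avgs = data.groupby(["group", "replicate"], as_index=False)["value"].mean()

    cmap_names = {"GROUP_A": "Greys", "GROUP_B": "Reds",
                  "GROUP_C": "Blues", "GROUP_D": "Greens"}
    palette = build_palette(group_names, [1, 2, 3], cmap_names)

    plot_cells_and_means(add_hue_key(data), add_hue_key(avgs), palette, group_names)
    plt.tight_layout()
    plt.show()
```

For variant C, the first swarmplot call could presumably drop hue= and palette= and pass a single color such as color="lightgray" instead, so that only the replicate means pick up the group colors.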
