Reputation: 452
I have the following pandas dataframe:
   Group      Exp1      Exp2      Exp3      Exp4      Exp5 Control
0      1  0.005556 -0.101111  0.052632 -0.055556  0.033333       y
1      2 -0.115684  0.076667 -0.349497  0.555556  0.555556       n
2      3  0.184444  0.251397  0.022222 -0.444444  0.611650       n
3      4  0.075556  0.237778  0.368750  0.098901 -0.111111       n
4      5 -0.186916 -0.355556  0.172414  0.087120  0.034737       y
5      6  0.250000  0.152542 -0.395349  0.111111  0.000000       n
6      7 -0.025014  0.030000  0.594444  0.055556  0.311111       n
7      8 -0.062500  0.123333  0.317778  0.144444  0.288889       n
8      9  0.001111  0.141111  0.181111  0.011111  0.435897       n
9     10 -0.124444 -0.074241  0.074444 -0.111111  0.133333       y
The typical seaborn stripplot uses the rows to plot different categories. However, I would like the categories to be the columns (the different experiments), with the 10 values of each experiment (one per group) plotted vertically above that experiment's marker on the x-axis. How do I achieve this?
Upvotes: 1
Views: 1997
Reputation: 80279
Seaborn usually works most easily with "long form" data, that is, with one column indicating the experiment and another holding the corresponding values. Seaborn also accepts some kinds of "wide" data when the dataframe is structured simply enough. In this case, converting the "Group" column to an index does the job.
For example, with random test data:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns
df = pd.DataFrame({f'Exp{i}': np.random.randn(10) for i in range(1, 6)})
df['Group'] = range(1, 11)
ax = sns.stripplot(data=df.set_index('Group'))
ax.xaxis.tick_top()
plt.show()
The wide form doesn't support hue in this case: sns.stripplot(data=df.drop(columns=['Group', 'Control']), hue=df['Control']) gives an error saying that hue is not supported when x and y are not explicitly set.
But the "long form" can be used. Pandas melt() converts a dataframe to the desired long form:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns
from io import StringIO
data_str = ''' Group Exp1 Exp2 Exp3 Exp4 Exp5 Control
0 1 0.005556 -0.101111 0.052632 -0.055556 0.033333 y
1 2 -0.115684 0.076667 -0.349497 0.555556 0.555556 n
2 3 0.184444 0.251397 0.022222 -0.444444 0.611650 n
3 4 0.075556 0.237778 0.368750 0.098901 -0.111111 n
4 5 -0.186916 -0.355556 0.172414 0.087120 0.034737 y
5 6 0.250000 0.152542 -0.395349 0.111111 0.000000 n
6 7 -0.025014 0.030000 0.594444 0.055556 0.311111 n
7 8 -0.062500 0.123333 0.317778 0.144444 0.288889 n
8 9 0.001111 0.141111 0.181111 0.011111 0.435897 n
9 10 -0.124444 -0.074241 0.074444 -0.111111 0.133333 y'''
df = pd.read_csv(StringIO(data_str), sep=r'\s+')  # sep=r'\s+' replaces the deprecated delim_whitespace=True
# The wide form with hue fails (see above), so it is left commented out:
# ax = sns.stripplot(data=df.drop(columns=['Control']), hue=df['Control'])
long_df = df.melt(id_vars=['Group', 'Control'], var_name='Experiment', value_name='Value')
ax = sns.stripplot(data=long_df, x='Experiment', y='Value', hue='Control')
ax.xaxis.tick_top()
plt.tight_layout()
plt.show()
The long form of the dataframe looks like:
   Group Control Experiment     Value
0      1       y       Exp1  0.005556
1      2       n       Exp1 -0.115684
2      3       n       Exp1  0.184444
3      4       n       Exp1  0.075556
4      5       y       Exp1 -0.186916
...
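To see what melt() does to the shape of the data, here is a minimal sketch on a three-row stand-in for the dataframe above. The values are copied from its first rows, and only two experiment columns are kept for brevity, so the row counts differ from the real data:

```python
import pandas as pd

# Cut-down stand-in for the question's dataframe (first three groups, two experiments).
df = pd.DataFrame({
    "Group": [1, 2, 3],
    "Exp1": [0.005556, -0.115684, 0.184444],
    "Exp2": [-0.101111, 0.076667, 0.251397],
    "Control": ["y", "n", "n"],
})

# id_vars are repeated for every experiment; the remaining columns are stacked
# into an 'Experiment' label column and a 'Value' column.
long_df = df.melt(id_vars=["Group", "Control"], var_name="Experiment", value_name="Value")

# One row per (Group, Experiment) pair: 3 groups x 2 experiments = 6 rows.
print(long_df.shape)  # (6, 4)
print(long_df)
```

With the full dataframe the same call should give 10 groups x 5 experiments = 50 rows, which is the form that x='Experiment', y='Value', hue='Control' expects.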
Upvotes: 1
Reputation: 48992
Drop the column you don't want to plot and pass the rest to data:
sns.stripplot(data=df.drop("Group", axis=1))
It's good to learn how to do the full transformation to long-form data that @JohanC demonstrates, but also good to know how to take advantage of the wide-form data support when it fits with what you want to do.
Upvotes: 3