Reputation: 439
I have created a bubble plot of true vs. predicted label values, and I would like to know whether the marker shapes can be changed according to the data split. I want to keep the colors of the plot mapped to interval_size and have only the shape change with the data split.
      min       max         y  interval_size    y_pred  split
 0.654531  1.021657  0.837415       0.367126  0.838094  train
 0.783401  1.261898  1.000000       0.478497  1.022649  valid
-0.166070  0.543749  0.059727       0.709819  0.188840  train
 0.493270  1.112610  0.504393       0.619340  0.802940  valid
 0.140510  0.572957  0.479063       0.432447  0.356734  train
import matplotlib.pyplot as plt
import seaborn as sns

sns.set_context("talk", font_scale=1.1)
plt.figure(figsize=(10,6))
sns.scatterplot(x="y",
y="y_pred",
size="interval_size",
data=df,
alpha=0.65,
c=interval_size,
cmap='viridis',
hue = 'split',
s = (interval_size**2)*50)
# Put the legend outside the figure
plt.legend(bbox_to_anchor=(1.01, 0.54), borderaxespad=0.)
#Plot Characteristics
plt.title("True vs Predicted Labels", fontsize = 36)
plt.xlabel("True Labels", fontsize = 25)
plt.ylabel("Predicted Labels", fontsize = 25)
Question:
It would be nice to include the validation data as well. How can I differentiate the splits by marker shape, e.g. triangle vs. circle?
Upvotes: 1
Views: 181
Reputation: 503
Seaborn packs a lot of in-depth customization into simple parameters. For your code, you only need to add one keyword argument to the sns.scatterplot() call:
style = 'split',
This changes the markers according to the categorical values in the split column, using seaborn's default markers. If you want more control over which markers are used, you can pass another parameter that maps each categorical value to a specific marker:
markers = {'train': 'X', 'valid':'s'},
The marker codes can be found on the Matplotlib website (https://matplotlib.org/3.1.0/api/markers_api.html).
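As a minimal sketch of the two parameters in isolation, the snippet below rebuilds the five sample rows from the question and plots them with one marker per split. The circle and triangle codes ('o' and '^') are chosen here to match the shapes the question mentions, and the name split_markers is only illustrative.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Sample rows copied from the question.
df = pd.DataFrame({
    "y": [0.837415, 1.000000, 0.059727, 0.504393, 0.479063],
    "y_pred": [0.838094, 1.022649, 0.188840, 0.802940, 0.356734],
    "interval_size": [0.367126, 0.478497, 0.709819, 0.619340, 0.432447],
    "split": ["train", "valid", "train", "valid", "train"],
})

# One marker per category in the "split" column (illustrative choice):
# circle for train, triangle for valid.
split_markers = {"train": "o", "valid": "^"}

fig, ax = plt.subplots(figsize=(10, 6))
sns.scatterplot(
    data=df,
    x="y",
    y="y_pred",
    style="split",          # marker shape follows the split
    markers=split_markers,  # explicit category -> marker mapping
    ax=ax,
)
```

Every value that occurs in the split column needs an entry in the dictionary, so it is worth checking the keys against df["split"].unique() before plotting.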
The final code should look like:
sns.scatterplot(x="y",
y="y_pred",
size="interval_size",
data=df,
alpha=0.65,
c=interval_size,
cmap='viridis',
hue = 'split',
s = (interval_size**2)*50,
style = 'split',
markers = {'train': 'X', 'valid':'s'},
)
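Note that hue = 'split' asks seaborn to color the points by split, while the question wants the colors to follow interval_size. One possible variant, offered as a sketch rather than a definitive fix, maps hue and size to interval_size and leaves only the marker shape to split. It uses seaborn's own hue, palette and sizes parameters instead of passing c, cmap and s through to Matplotlib. The sizes range of (50, 400) and the 'o'/'^' markers are assumptions picked for illustration, and the DataFrame holds the sample rows from the question.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Sample rows copied from the question.
df = pd.DataFrame({
    "y": [0.837415, 1.000000, 0.059727, 0.504393, 0.479063],
    "y_pred": [0.838094, 1.022649, 0.188840, 0.802940, 0.356734],
    "interval_size": [0.367126, 0.478497, 0.709819, 0.619340, 0.432447],
    "split": ["train", "valid", "train", "valid", "train"],
})

fig, ax = plt.subplots(figsize=(10, 6))
sns.scatterplot(
    data=df,
    x="y",
    y="y_pred",
    hue="interval_size",   # numeric column, so colors come from a continuous map
    palette="viridis",
    size="interval_size",  # bubble size also follows the interval size
    sizes=(50, 400),       # assumed smallest and largest marker sizes
    style="split",         # only the marker shape depends on the split
    markers={"train": "o", "valid": "^"},
    alpha=0.65,
    ax=ax,
)

# Put the legend outside the axes, as in the question.
ax.legend(bbox_to_anchor=(1.01, 1), loc="upper left", borderaxespad=0)
ax.set_title("True vs Predicted Labels")
ax.set_xlabel("True Labels")
ax.set_ylabel("Predicted Labels")
```

With this mapping the legend lists the interval_size levels and the two split markers in separate groups, so color and shape can be read independently.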
Upvotes: 2