Reputation: 51
I want to use a different marker style for each class in this binary response data, but I couldn't work out how to do it here. I would like a triangle marker for one class and a star marker for the other. It would also be nice to have a customized legend, such as 1 for the majority class and 0 for the minority class. I appreciate your suggestions. Thanks!
import numpy as np
import matplotlib.pyplot as plt
from sklearn import svm
from sklearn.datasets import make_blobs
from numpy.random import seed
seed(133)
X, y = make_blobs(n_samples=[1000, 10], centers=[[0.0, 0.0], [2.0, 2.0]], cluster_std=[1.5, 0.5], random_state=0, shuffle=False)
colormap = np.array(['tab:orange', 'tab:blue'])
# cmap is dropped here: it is ignored when explicit colors are passed via c
plt.scatter(X[:, 0], X[:, 1], s=40, c=colormap[y], edgecolors='k')
Upvotes: 4
Views: 2398
Reputation: 41327
If you are willing to use seaborn, sns.scatterplot has a style param to specify marker groups:
style: vector or key in data. Grouping variable that will produce points with different markers. Can have a numeric dtype but will always be treated as categorical.
Then the markers param allows you to specify the markers for each style level:
import seaborn as sns
sns.scatterplot(x=X[:, 0], y=X[:, 1], s=40, hue=y, style=y, markers=['*', '^'])
However, with pure plt.scatter, the marker param only accepts a single value, so you should plot a separate scatter per class:
marker = ['*', '^']
for group in set(y):
    plt.scatter(
        X[y == group, 0], X[y == group, 1],  # filter by group
        marker=marker[group],                # set marker per group
        label=group,                         # set legend label
        s=40, c=colormap[group], edgecolors='k',  # one explicit color per group, so no cmap
    )
plt.legend()
This assumes your classes are 0 and 1, as in your example. If the real classes are labeled something else, you should enumerate the loop to access a numeric index:
for index, group in enumerate(set(y)):
    ...
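As a rough sketch of the enumerate approach with a customized legend, the example below uses made-up string class names ("majority" and "minority"), random data in place of make_blobs, and a hypothetical legend_labels dict that maps each class to the legend text requested in the question. None of these names come from the original code, and it uses np.unique instead of set so that the group order is sorted and predictable.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)

# Illustrative data (an assumption): 1000 majority points and 10 minority points
X = np.vstack([
    rng.normal(0.0, 1.5, size=(1000, 2)),
    rng.normal(2.0, 0.5, size=(10, 2)),
])
y = np.array(["majority"] * 1000 + ["minority"] * 10)

markers = ["^", "*"]                                # triangle, star
colors = ["tab:orange", "tab:blue"]
legend_labels = {"majority": "1", "minority": "0"}  # hypothetical custom legend text

fig, ax = plt.subplots()
for index, group in enumerate(np.unique(y)):        # np.unique returns sorted labels
    mask = y == group                               # boolean filter for this class
    ax.scatter(
        X[mask, 0], X[mask, 1],
        marker=markers[index],                      # marker chosen by position, not by label
        c=colors[index],
        label=legend_labels[group],                 # legend entry for this class
        s=40, edgecolors="k",
    )
ax.legend(title="class")
```

Call plt.show() or fig.savefig(...) afterwards to view the result. Indexing markers and colors by position keeps the loop independent of what the class labels actually are.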
Upvotes: 4