Reputation: 1439
I am almost done with my first real Python data science project. However, there is one last thing I can't seem to figure out. I have the following code to create a plot for my PCA and K-Means clustering algorithm:
import matplotlib.pyplot as plt
import seaborn as sns
from adjustText import adjust_text

y_axis = passers_pca_kmeans['Component 1']
x_axis = passers_pca_kmeans['Component 2']
plt.figure(figsize=(10,8))
# x and y are passed by keyword; recent seaborn versions do not accept them positionally
sns.scatterplot(x=x_axis, y=y_axis, hue=passers_pca_kmeans['Segment'], palette=['g','r','c','m'])
plt.title('Clusters by PCA Components')
plt.grid(zorder=0,alpha=.4)
# one text label per point
texts = [plt.text(x0,y0,name,ha='right',va='bottom') for x0,y0,name in zip(
    passers_pca_kmeans['Component 2'], passers_pca_kmeans['Component 1'], passers_pca_kmeans.name)]
adjust_text(texts)
plt.show()
I label the points with adjustText, but my plot has too many points to label them all; it looks like a mess with text everywhere. The 'Segment' column has four values: 'first', 'second', 'third', 'fourth'. How can I change my adjustText code to only annotate points where 'Segment'='first'? Is this an np.where situation?
Upvotes: 1
Views: 1303
Reputation: 8800
You could use a boolean mask to slice the inputs to the plt.text call, something like:
# True only for rows in the 'first' segment
mask = (passers_pca_kmeans["Segment"] == "first")
x = passers_pca_kmeans["Component 2"][mask]
y = passers_pca_kmeans["Component 1"][mask]
names = passers_pca_kmeans.name[mask]
texts = [plt.text(x0,y0,name,ha='right',va='bottom') for x0,y0,name in zip(x,y,names)]
You could also make an unruly list comprehension by adding an if condition:
x = passers_pca_kmeans["Component 2"]
y = passers_pca_kmeans["Component 1"]
names = passers_pca_kmeans.name
segments = passers_pca_kmeans["Segment"]
# label a point only when its segment is 'first'
texts = [plt.text(x0,y0,name,ha='right',va='bottom') for x0,y0,name,segment in zip(x,y,names,segments) if segment == "first"]
I bet there is an answer with np.where as well.
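For what it's worth, here is a rough sketch of how an np.where version might look. The small DataFrame is made up to stand in for passers_pca_kmeans (same column names as in the question, invented values), and label_points is just an illustrative name. Called with a single condition, np.where returns the integer positions where the condition holds, which can then be used to pick out the rows to label.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for passers_pca_kmeans (values are invented)
df = pd.DataFrame({
    "Component 1": [0.1, 1.2, -0.5, 2.0],
    "Component 2": [1.0, -0.3, 0.7, 0.2],
    "Segment": ["first", "second", "first", "third"],
    "name": ["A", "B", "C", "D"],
})

# With one argument, np.where returns a tuple of index arrays;
# [0] takes the row positions where Segment is 'first'
idx = np.where(df["Segment"].to_numpy() == "first")[0]

# (x, y, label) for the selected rows only
label_points = [
    (df["Component 2"].iloc[i], df["Component 1"].iloc[i], df["name"].iloc[i])
    for i in idx
]
```

These tuples can be fed to plt.text in place of the zip(...) in the comprehensions above, followed by adjust_text(texts) as in the question. In practice the boolean mask already does the same job, so np.where mostly helps if you want the positions themselves.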
Upvotes: 1