Néstor

Reputation: 23

Keep indices in a scatter plot of jaccard distance matrix

I have a distance matrix and I want to plot it as a 2D scatter plot.

I have found a way through sklearn.manifold:

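import matplotlib.pyplot as plt
from sklearn.manifold import MDS

# jac_sim is the precomputed distance matrix; embedding_ holds the 2D coordinates
mds = MDS(n_components=2, dissimilarity='precomputed')
X_r = mds.fit(jac_sim).embedding_
plt.figure()
plt.scatter(X_r[:, 0], X_r[:, 1], c="red")
plt.savefig(args.Directory + "/MDS2.svg", format="svg")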

Here jac_sim is my distance matrix, which looks something like this: [image: distance matrix]

This code gives me the following plot: [image: scatter plot]

I would like to carry over the column or index names from the distance matrix so that I can color-code the dots in the plot by Indiv number and add a label to each one. I checked X_r, but it only contains the coordinates of the scatter plot, with no information about which row each point came from.

How can I color code it by column/index name?

Upvotes: 0

Views: 646

Answers (1)

Guimoute

Reputation: 4649

If you know that the size and ordering of jac_sim will not change, you always know which rows belong to which Indiv number, so you could draw two scatters using different slices of the data:

mds = MDS(n_components=2, dissimilarity='precomputed')
X_r = mds.fit(jac_sim).embedding_
plt.figure()
# First three rows in red, all remaining rows in blue
plt.scatter(X_r[:3, 0], X_r[:3, 1], c="red")
plt.scatter(X_r[3:, 0], X_r[3:, 1], c="blue")
plt.savefig(args.Directory + "/MDS2.svg", format="svg")
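
If the row positions are not fixed, another option is to reattach the names to the coordinates. The rows of embedding_ come out in the same order as the rows of the input matrix, so the index of jac_sim can be reused directly. The following is only a rough sketch: it assumes jac_sim is a pandas DataFrame, and the toy data, the "Indiv1_a" naming scheme, the split on "_" and the output file name are all made up for illustration, so adapt them to your real labels.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from sklearn.manifold import MDS

# Hypothetical stand-in for jac_sim: a small symmetric Jaccard distance
# matrix whose index/columns follow an assumed "IndivN_x" naming scheme.
names = ["Indiv1_a", "Indiv1_b", "Indiv1_c", "Indiv2_a", "Indiv2_b"]
sets = [{1, 2, 3}, {1, 2, 4}, {1, 3, 4}, {7, 8, 9}, {7, 8, 10}]
dist = np.array([[1 - len(a & b) / len(a | b) for b in sets] for a in sets])
jac_sim = pd.DataFrame(dist, index=names, columns=names)

mds = MDS(n_components=2, dissimilarity="precomputed", random_state=0)
X_r = mds.fit(jac_sim.values).embedding_

# Row i of X_r corresponds to row i of jac_sim, so the index can be reattached.
coords = pd.DataFrame(X_r, index=jac_sim.index, columns=["x", "y"])
# Assumed rule: the part of the name before "_" identifies the individual.
coords["group"] = coords.index.str.split("_").str[0]

fig, ax = plt.subplots()
for group, sub in coords.groupby("group"):
    # One scatter call per group, so each group gets its own color and legend entry.
    ax.scatter(sub["x"], sub["y"], label=group)
    for name, row in sub.iterrows():
        # Write the original row name next to its point.
        ax.annotate(name, (row["x"], row["y"]))
ax.legend()
fig.savefig("MDS2_labeled.svg", format="svg")  # hypothetical output file name
```

Using one scatter call per group lets matplotlib pick a distinct color for each group automatically, and nothing depends on hard-coded row positions.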

Upvotes: 0
