Reputation: 3173
I have the following code to apply multidimensional scaling (MDS) to a sample of data called parkinsonData:
iterations=4
count=0
while(count<iterations):
    mds1=manifold.MDS(n_components=2, max_iter=3000)
    pos=mds1.fit(parkinsonData).embedding_
    plt.scatter(pos[:, 0], pos[:, 1])
    count=count+1
With this I get 4 different plots from the MDS algorithm; they all differ because of the random seed. Each plot is drawn in a different color, but parkinsonData has a column called status that holds 0 or 1 values, and I want to show this difference in every plot using different colors.
For example, I want to achieve:
First plot: one color for the 0 values in the status field, and a different color for the 1 values.
Second plot: one color for the 0 values in the status field, and a different color for the 1 values (both colors different from the first plot).
Third plot: one color for the 0 values in the status field, and a different color for the 1 values (both colors different from the first and second plots).
Fourth plot: one color for the 0 values in the status field, and a different color for the 1 values (both colors different from the first, second and third plots).
Does anyone know how to achieve this behavior?
Upvotes: 0
Views: 2398
Reputation: 597
This will work:
# assign categories
categories = pd.factorize(df['species'])[0]
# use colormap
colormap= np.random.randint(100, size=(np.unique(categories).size))
iterations=4
count=0
while(count<iterations):
    mds1=manifold.MDS(n_components=2, max_iter=3000)
    pos=mds1.fit(X).embedding_
    plt.scatter(pos[:, 0], pos[:, 1], c=colormap[categories])
    count=count+1
plt.legend(colormap[categories], categories)
plt.show()
However, matplotlib builds legend entries per plotted artist, not per color value, so the plt.legend call above does not give a proper per-category legend. With plain matplotlib you would have to build the legend manually, which is not very convenient.
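For reference, a manual legend in plain matplotlib could look roughly like the sketch below: one scatter call per category, each with its own label. The random `pos` and `status` arrays are only stand-ins (assumptions) for the MDS embedding and the 0/1 status column, and the two color names are arbitrary choices.

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
pos = rng.normal(size=(30, 2))        # stand-in for mds1.fit(X).embedding_ (assumption)
status = rng.integers(0, 2, size=30)  # stand-in for the 0/1 status column (assumption)

fig, ax = plt.subplots()
# One scatter call per status value, so each class becomes its own legend entry
for value, color in zip((0, 1), ("tab:blue", "tab:orange")):
    mask = status == value
    ax.scatter(pos[mask, 0], pos[mask, 1], color=color, label=f"status = {value}")
ax.legend()

# Collect the legend texts, just to confirm what the legend contains
legend_labels = [t.get_text() for t in ax.get_legend().get_texts()]
```

Call plt.show() (or fig.savefig) afterwards to see the figure. To give every MDS run its own colors, swap the color tuple for a different pair on each iteration.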
It is easier to use seaborn and its hue parameter, like this:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns # <<<<<<<<<<<< use this
from sklearn.datasets import load_iris
from sklearn import manifold
def loadDataSet():
    # build a DataFrame with the iris features, the numeric target and the species name
    iris = load_iris()
    df= pd.DataFrame(data= np.c_[iris['data'], iris['target']],
                     columns= iris['feature_names'] + ['target'])
    df['species'] = pd.Categorical.from_codes(iris.target, iris.target_names)
    return df
df= loadDataSet()
X= df.iloc[:,:4].copy(True)  # the four feature columns
Y= df.iloc[:,4:].copy(True)  # target and species
iterations=4
count=0
while(count<iterations):
    plt.figure()  # new figure for each MDS run
    mds1=manifold.MDS(n_components=2, max_iter=3000)
    pos=mds1.fit(X).embedding_
    # hue colors the points by species and adds the legend automatically
    sns.scatterplot(x=pos[:, 0], y=pos[:, 1], data=df, hue=df['species'])
    count=count+1
plt.show()
Upvotes: 0
Reputation: 4199
You can do something like this:
%matplotlib inline
import matplotlib.pyplot as plt
# example data
Y = [[ 1 , 2 , 3 ,6], [ 1 , 2 , 3 ,6], [ 1 , 2 , 3 ,6], [ 1 , 2 , 3 ,6]]
X = [[ 1 , 2 , 4 ,5], [ 1 , 2 , 3 ,6], [ 1 , 2 , 3 ,6], [ 1 , 2 , 3 ,6]]
status = [[0,1,0,0], [0,0,1,1], [1,1,0,0], [0,1,0,1]]
# create a list of list of unique colors for 4 plots
my_colors = [['red','green'],['blue','black'],['magenta','grey'],['purple','cyan']]
iterations=4
count=0
while(count<iterations):
    plt.figure()
    # plot each point of this dataset, colored by its status value
    for i in range(len(X[count])):
        plt.scatter(X[count][i],Y[count][i],color = my_colors[count][status[count][i]])
    count=count+1
plt.show()
This results in four figures, each with its own pair of colors (I am only attaching 2 images, but 4 are created).
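The same idea can probably be written without the inner per-point loop by building the list of point colors in one step and passing it to a single scatter call per figure. This is only a sketch: the random `status` and `pos` arrays are stand-ins (assumptions) for the status column and for one MDS embedding per run, and `color_pairs` plays the role of my_colors.

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
status = rng.integers(0, 2, size=20)  # stand-in for parkinsonData status values (assumption)
color_pairs = [("red", "green"), ("blue", "black"),
               ("magenta", "grey"), ("purple", "cyan")]

figures = []
for pair in color_pairs:
    pos = rng.normal(size=(status.size, 2))  # stand-in for one MDS embedding (assumption)
    # status 0 picks the first color of the pair, status 1 the second
    point_colors = np.array(pair)[status].tolist()
    fig, ax = plt.subplots()
    ax.scatter(pos[:, 0], pos[:, 1], c=point_colors)
    figures.append(fig)
```

In the real loop, `pos` would come from mds1.fit(parkinsonData).embedding_ and `status` from the status column converted to integers; plt.show() then displays the four figures.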
Upvotes: 1