mrc

Reputation: 3173

Applying multidimensional scaling to data in Python

I have the following code to apply multidimensional scaling (MDS) to a sample of data called parkinsonData:

iterations=4
count=0
while(count<iterations):
    mds1=manifold.MDS(n_components=2, max_iter=3000)
    pos=mds1.fit(parkinsonData).embedding_
    plt.scatter(pos[:, 0], pos[:, 1])
    count=count+1

With this I get 4 different plots from the MDS algorithm, all of them different because of the random seed. The plots are drawn in different colors, but parkinsonData has a column called status that holds 0 or 1 values, and I want to show this difference in every plot using different colors.

For example, I want to achieve:

First plot: one color for the 0 values in the status field and a different color for the 1 values.

Second plot: one color for the 0 values in the status field and a different color for the 1 values. (Both colors different from the first plot.)

Third plot: one color for the 0 values in the status field and a different color for the 1 values. (Both colors different from the first and second plots.)

Fourth plot: one color for the 0 values in the status field and a different color for the 1 values. (Both colors different from the first, second and third plots.)

Does anyone know how to achieve this?

Upvotes: 0

Views: 2398

Answers (2)

JeeyCi

Reputation: 597

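This will work:

from matplotlib.lines import Line2D

# assign an integer code to each category (df and X are defined in the full example further down)
categories, labels = pd.factorize(df['species'])
# pick one random colormap value per category
colormap = np.random.randint(100, size=labels.size)

iterations = 4
count = 0
while count < iterations:
    mds1 = manifold.MDS(n_components=2, max_iter=3000)
    pos = mds1.fit(X).embedding_
    sc = plt.scatter(pos[:, 0], pos[:, 1], c=colormap[categories])
    count = count + 1
    # build the legend by hand: one proxy marker per category, in its scatter color
    handles = [Line2D([], [], marker='o', linestyle='', color=rgba)
               for rgba in sc.to_rgba(colormap)]
    plt.legend(handles, list(labels))
    plt.show()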

However, matplotlib creates legend entries per plotted artist (one per plot call), not per color inside a single scatter call, so with plain matplotlib you still have to build the legend by hand, which is not very convenient.

You can benefit from seaborn and its hue parameter, like this:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns   # <<<<<<<<<<<< use this

from sklearn.datasets import load_iris
from sklearn import manifold

def loadDataSet():
    # load iris into a DataFrame: feature columns, numeric target, species name
    iris = load_iris()
    df = pd.DataFrame(data=np.c_[iris['data'], iris['target']],
                      columns=iris['feature_names'] + ['target'])
    df['species'] = pd.Categorical.from_codes(iris.target, iris.target_names)
    return df

df = loadDataSet()
X = df.iloc[:, :4].copy(True)   # the four feature columns
Y = df.iloc[:, 4:].copy(True)   # target and species

iterations = 4
count = 0
while count < iterations:
    mds1 = manifold.MDS(n_components=2, max_iter=3000)
    pos = mds1.fit(X).embedding_
    # hue colors the points by species and adds the legend automatically
    sns.scatterplot(x=pos[:, 0], y=pos[:, 1], data=df, hue=df['species'])
    count = count + 1
    plt.show()
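
For the data in the question, the same hue idea can probably be combined with seaborn's palette argument so that every plot gets its own pair of colors for status 0 and 1. This is only a minimal sketch, assuming the features are numeric and status holds nothing but 0 and 1. The names PALETTES and plot_mds_runs and the chosen colors are made up for illustration, and random_state is set per run only so that the four embeddings are reproducible:

```python
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn import manifold

# Hypothetical color pairs, one dict per plot: {status value: color}
PALETTES = [
    {0: "red", 1: "green"},
    {0: "blue", 1: "black"},
    {0: "magenta", 1: "grey"},
    {0: "purple", 1: "cyan"},
]


def plot_mds_runs(features, status, palettes=PALETTES):
    """Run MDS once per palette and draw each embedding in its own figure."""
    status = np.asarray(status)  # plain array avoids pandas index alignment
    figures = []
    for seed, palette in enumerate(palettes):
        # a different random_state per run gives a different embedding
        mds = manifold.MDS(n_components=2, max_iter=3000, random_state=seed)
        pos = mds.fit(features).embedding_
        fig, ax = plt.subplots()
        # a dict palette maps each hue level (0 or 1) to a fixed color
        sns.scatterplot(x=pos[:, 0], y=pos[:, 1], hue=status,
                        palette=palette, ax=ax)
        ax.set_title(f"MDS run {seed + 1}")
        figures.append(fig)
    return figures
```

With the question's data this might be called as plot_mds_runs(parkinsonData.drop(columns="status"), parkinsonData["status"]) followed by plt.show(), assuming parkinsonData is a pandas DataFrame whose remaining columns are all numeric.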

Upvotes: 0

plasmon360

Reputation: 4199

You can do something like this:

# %matplotlib inline   (uncomment when running in a Jupyter notebook)
import matplotlib.pyplot as plt

# example data: one row of x, y and status values per plot
Y = [[1, 2, 3, 6], [1, 2, 3, 6], [1, 2, 3, 6], [1, 2, 3, 6]]
X = [[1, 2, 4, 5], [1, 2, 3, 6], [1, 2, 3, 6], [1, 2, 3, 6]]
status = [[0, 1, 0, 0], [0, 0, 1, 1], [1, 1, 0, 0], [0, 1, 0, 1]]

# create a list of unique color pairs, one pair per plot: [color for 0, color for 1]
my_colors = [['red', 'green'], ['blue', 'black'], ['magenta', 'grey'], ['purple', 'cyan']]


iterations = 4
count = 0
while count < iterations:
    plt.figure()
    # loop over the points of the current plot and color each one by its status
    for i, x in enumerate(X[count]):
        plt.scatter(x, Y[count][i], color=my_colors[count][status[count][i]])
    count = count + 1
    plt.show()

This results in 4 figures, each with its own pair of colors (only 2 of the images were attached to this answer).
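
The per-point loop above can also be collapsed into one scatter call per figure by building the list of colors first. Below is a rough sketch applied to an MDS embedding instead of the example lists. status_colors and plot_mds_by_status are hypothetical helper names, and status is assumed to contain only 0 and 1:

```python
import matplotlib.pyplot as plt
from sklearn import manifold

# same layout as my_colors above: [color for 0, color for 1] per plot
MY_COLORS = [["red", "green"], ["blue", "black"],
             ["magenta", "grey"], ["purple", "cyan"]]


def status_colors(status, color_pair):
    """Return one color name per point, picked by its 0/1 status value."""
    return [color_pair[int(s)] for s in status]


def plot_mds_by_status(features, status, color_pairs=MY_COLORS):
    """Draw one MDS embedding per color pair, each in a new figure."""
    figures = []
    for color_pair in color_pairs:
        # no random_state here, so every run starts from a new random init
        mds = manifold.MDS(n_components=2, max_iter=3000)
        pos = mds.fit(features).embedding_
        fig, ax = plt.subplots()
        # c accepts a list of color names, one entry per point
        ax.scatter(pos[:, 0], pos[:, 1], c=status_colors(status, color_pair))
        figures.append(fig)
    return figures
```

Calling plot_mds_by_status(features, status) and then plt.show() should give four figures, each using its own color pair.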

Upvotes: 1
