Aghyad Skaif
Aghyad Skaif

Reputation: 49

Visualizing SOM and adding labels to the map

I have been trying to apply SOM on my dataframe, my dataframe has 25 columns where each column represents a house, each house has a values for power consumption for two years, and I want to cluster the data with number of clusters = 3. I have done the following:

import sys
sys.path.insert(0, '../')
%load_ext autoreload

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from pylab import plot,axis,show,pcolor,colorbar,bone
from matplotlib.patches import Patch
%matplotlib inline

from minisom import MiniSom
from sklearn.preprocessing import minmax_scale, scale
%autoreload 2
data1 = pd.read_excel(r"C:\Users\user\Desktop\Thesis\Tarek\Consumption.xlsx")
data1['h1'] = data1['h1'].str.split(';').str[2].astype('float')
data1['h2'] = data1['h2'].str.split(';').str[2].astype('float')
data1['h3'] = data1['h3'].str.split(';').str[2].astype('float')
data1['h4'] = data1['h4'].str.split(';').str[2].astype('float')
data1['h5'] = data1['h5'].str.split(';').str[2].astype('float')
data1['h6'] = data1['h6'].str.split(';').str[2].astype('float')
data1['h7'] = data1['h7'].str.split(';').str[2].astype('float')
data1['h8'] = data1['h8'].str.split(';').str[2].astype('float')
data1['h9'] = data1['h9'].str.split(';').str[2].astype('float')
data1['h10'] = data1['h10'].str.split(';').str[2].astype('float')
data1['h11'] = data1['h11'].str.split(';').str[2].astype('float')
data1['h12'] = data1['h12'].str.split(';').str[2].astype('float')
data1['h13'] = data1['h13'].str.split(';').str[2].astype('float')
data1['h14'] = data1['h14'].str.split(';').str[2].astype('float')
data1['h15'] = data1['h15'].str.split(';').str[2].astype('float')
data1['h16'] = data1['h16'].str.split(';').str[2].astype('float')
data1['h17'] = data1['h17'].str.split(';').str[2].astype('float')
data1['h18'] = data1['h18'].str.split(';').str[2].astype('float')
data1['h19'] = data1['h19'].str.split(';').str[2].astype('float')
data1['h20'] = data1['h20'].str.split(';').str[2].astype('float')
data1['h21'] = data1['h21'].str.split(';').str[2].astype('float')
data1['h22'] = data1['h22'].str.split(';').str[2].astype('float')
data1['h23'] = data1['h23'].str.split(';').str[2].astype('float')
data1['h24'] = data1['h24'].str.split(';').str[2].astype('float')
data1['h25'] = data1['h25'].str.split(';').str[2].astype('float')
data1.fillna(0,inplace=True)
data1=data1.round(decimals=2)
X=data1.values
som =MiniSom(x=3,y=3,input_len=25,sigma=1.0, learning_rate=0.5)
som.random_weights_init(X)
som.train_batch(data=X ,num_iteration=1000,verbose=True)
bone()
pcolor(som.distance_map().T)
colorbar()
markers = ['o' , 's','v']
colors = ['r', 'g','y']
for i, x in enumerate(X):
    w = som.winner(x)
    plot(w[0] + 0.5,
         w[1] + 0.5,
         markers[i],
         markeredgecolor = colors[i],
         markerfacecolor = 'None',
         markersize = 10,
         markeredgewidth = 2)
show()

when I am running the code, I am getting this error: IndexError: list index out of range please any tips to add the markers and colors in the right way without having any problems, and I would be glad if any one can help, I am a bit new to Python and tried to find a solution but I couldn`t find any.

Upvotes: 0

Views: 295

Answers (1)

Sheldore
Sheldore

Reputation: 39072

The problem seems to be that the length of your X=data1.values is around 25 but the length of your markers and colors is only 3. So in the following for loop, when i is 3, you are trying to access markers[3] and colors[3] which throws an IndexError because both markers and colors goes up to index 2 (indexing starts from 0 in python)

for i, x in enumerate(X):

One solution is to define custom list of 25 markers and 25 colors. While you might want to define your own markers, you can leave the colors out and let the code choose automatic colors for the markeredgecolor

Upvotes: 0

Related Questions