Chob

Reputation: 93

Classifier.predict In Python

I have this code in Python:

def plot_decision_regions(X, y, classifier, resolution = 0.02):


    markers = ('s', 'x', 'o', '^','v')
    colors = ('red', 'blue', 'lightgreen', 'gray', 'cyan')
    cmap = ListedColormap(colors[:len(np.unique(y))])


    x1_min, x1_max = X[:, 0].min() -1, X[:,0].max() + 1
    x2_min, x2_max = X[:, 1].min() -1, X[:,1].max() + 1
    xx1, xx2 = np.meshgrid(np.arange(x1_min, x1_max, resolution),
                           np.arange(x2_min, x2_max, resolution))
    Z = classifier.predict(np.array([xx1.ravel(), xx2.ravel()]).T)
    Z = Z.reshape(xx1.shape)
    plt.contourf(xx1, xx2, Z, alpha= 0.3, cmap = cmap)
    plt.xlim(xx1.min(), xx1.max())
    plt.ylim(xx2.min(), xx2.max())


for idx, cl in enumerate (np.unique(y)):
    plt.scatter (x=X[y == cl, 0], y= X[y == cl, 1], alpha=0.8, c=colors[idx], marker= markers [idx], label = cl, edgecolor = 'black')

Here X is a 100x2 array of measurements (sepal length and petal length for two kinds of flowers), y is a length-100 vector of class labels containing only -1 and 1, and classifier is a Perceptron. I don't know why I need to take the transpose in:

Z = classifier.predict(np.array([xx1.ravel(), xx2.ravel()]).T)

What does

classifier.predict 

and

x=X[y == cl, 0], y= X[y == cl, 1]

in

plt.scatter (x=X[y == cl, 0], y= X[y == cl, 1], alpha=0.8, c=colors[idx], marker= markers [idx], label = cl, edgecolor = 'black')

do?

Before this I load a dataframe, define my predict method, and define X and y:

def predict(self,X):
    '''Return class label after unit step'''
    return np.where(self.net_input(X) >= 0.0, 1, -1)

My Perceptron class also contains w_ (the weights), which are adjusted on each iteration. Sorry if my English is not perfect.

y = df.iloc[0:100, 4].values
y = np.where(y == 'Iris-setosa', -1, 1)

X = df.iloc[0:100, [0, 2]].values

Upvotes: 1

Views: 1592

Answers (1)

Jay Mody

Reputation: 4033

Let's break this down, first:

np.array([xx1.ravel(), xx2.ravel()])

.ravel() flattens the xx1 and xx2 arrays into 1-D arrays. xx1 and xx2 hold the coordinates (for feature 1 and feature 2 respectively) of a grid of points that np.meshgrid builds over the range of the feature set, with one point every resolution units along each axis. With enough of these points, you can see which regions of the feature space your classifier assigns to which label.
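
To make the grid concrete, here is a minimal sketch on a tiny 3x2 grid. The values are made up for illustration and are not from the iris data:

```python
import numpy as np

# Tiny axes so the arrays are easy to read (illustrative values only).
x1_values = np.arange(0.0, 3.0, 1.0)    # feature 1: [0., 1., 2.]
x2_values = np.arange(10.0, 12.0, 1.0)  # feature 2: [10., 11.]

xx1, xx2 = np.meshgrid(x1_values, x2_values)
# Both arrays have shape (2, 3): one entry per grid point.
# xx1 repeats the feature-1 values along every row,
# xx2 repeats each feature-2 value across its row.

flat_x1 = xx1.ravel()  # 1-D array of the feature-1 coordinate of every point
flat_x2 = xx2.ravel()  # 1-D array of the feature-2 coordinate of every point

print(xx1.shape, flat_x1.shape)
```

Pairing flat_x1[i] with flat_x2[i] gives the i-th grid point, so the two flat arrays together list every point of the grid exactly once.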

np.array([xx1.ravel(), xx2.ravel()]).T

You need the transpose because the .predict() method expects an input array of shape [n_samples, n_features], i.e. one row per point and one column per feature. Stacking the two ravelled arrays gives an array of shape [n_features, n_samples], so it has to be transposed first.
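
You can check the shapes yourself. This is a small sketch using the same kind of toy grid (made-up values); np.column_stack is shown only as an equivalent alternative, it is not what the original code uses:

```python
import numpy as np

xx1, xx2 = np.meshgrid(np.arange(0.0, 3.0, 1.0), np.arange(10.0, 12.0, 1.0))

stacked = np.array([xx1.ravel(), xx2.ravel()])
# stacked has shape (2, 6): row 0 is every feature-1 value,
# row 1 is every feature-2 value, i.e. (n_features, n_samples).

grid_points = stacked.T
# grid_points has shape (6, 2): each row is one (feature1, feature2) sample,
# which is the (n_samples, n_features) layout that predict() expects.

# np.column_stack builds the same array without an explicit transpose.
same_points = np.column_stack([xx1.ravel(), xx2.ravel()])

print(stacked.shape, grid_points.shape)
```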

classifier.predict(np.array([xx1.ravel(), xx2.ravel()]).T)

This predicts a label for every meshgrid point. The flat array of predictions is then reshaped back to the grid's shape (Z.reshape(xx1.shape)) and passed to plt.contourf, which draws a shaded mask over the plot showing which regions the classifier assigns to which label.
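
Here is a rough sketch of that predict-then-reshape step. ThresholdClassifier is a hypothetical stand-in I made up so the example runs without training anything; it is not your Perceptron, it just follows the same predict(X) convention:

```python
import numpy as np

class ThresholdClassifier:
    """Hypothetical stand-in: predicts 1 where feature1 + feature2 >= 0, else -1."""

    def predict(self, X):
        # X has shape (n_samples, n_features); the result has shape (n_samples,).
        return np.where(X[:, 0] + X[:, 1] >= 0.0, 1, -1)

classifier = ThresholdClassifier()

# Small grid: feature 1 in [-1, 0, 1], feature 2 in [-1, 0].
xx1, xx2 = np.meshgrid(np.arange(-1.0, 2.0, 1.0), np.arange(-1.0, 1.0, 1.0))

# One prediction per grid point, returned as a flat array.
Z = classifier.predict(np.array([xx1.ravel(), xx2.ravel()]).T)

# Reshape so Z_grid[i, j] is the label of the point (xx1[i, j], xx2[i, j]).
Z_grid = Z.reshape(xx1.shape)

print(Z_grid)
```

plt.contourf(xx1, xx2, Z_grid, ...) needs Z_grid to have the same shape as xx1 and xx2, which is why the reshape is there.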

plt.scatter (x=X[y == cl, 0], y= X[y == cl, 1], alpha=0.8, c=colors[idx], marker= markers [idx], label = cl, edgecolor = 'black')

Here, we plot our samples. We want to plot each class of samples separately (so that each class gets its own color and marker). y == cl is a boolean mask that is True for the rows whose label equals the class we are currently plotting, so x=X[y == cl, 0] selects the first feature of those rows and y=X[y == cl, 1] selects their second feature. cl simply iterates over all the unique labels in y.
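
A small sketch of that boolean indexing, using four made-up samples rather than your real data:

```python
import numpy as np

# Four illustrative samples with two features each, and their labels.
X = np.array([[5.1, 1.4],
              [7.0, 4.7],
              [4.9, 1.5],
              [6.4, 4.5]])
y = np.array([-1, 1, -1, 1])

mask = (y == -1)       # True for the rows labelled -1
x_coords = X[mask, 0]  # first feature of those rows
y_coords = X[mask, 1]  # second feature of those rows

# The plotting loop does the same selection once per unique label.
for idx, label in enumerate(np.unique(y)):
    print(idx, label, X[y == label, 0], X[y == label, 1])
```

Each pass of the loop therefore hands plt.scatter only the points of one class.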

It's easier to understand once you see what the result looks like. Here's an example using a make_blobs dataset and an MLPClassifier:

import numpy as np
import matplotlib.pyplot as plt

from matplotlib.colors import ListedColormap
from sklearn.datasets import make_blobs
from sklearn.neural_network import MLPClassifier

def plot_decision_regions(X, y, classifier, resolution = 0.02):
    markers = ('s', 'x', 'o', '^','v')
    colors = ('red', 'blue', 'lightgreen', 'gray', 'cyan')
    cmap = ListedColormap(colors[:len(np.unique(y))])

    x1_min, x1_max = X[:, 0].min() -1, X[:,0].max() + 1
    x2_min, x2_max = X[:, 1].min() -1, X[:,1].max() + 1
    xx1, xx2= np.meshgrid (np.arange(x1_min, x1_max, resolution), np.arange(x2_min, x2_max, resolution))
    Z = classifier.predict(np.array([xx1.ravel(), xx2.ravel()]).T)
    Z = Z.reshape(xx1.shape)
    plt.contourf(xx1, xx2, Z, alpha= 0.3, cmap = cmap)
    plt.xlim(xx1.min(), xx1.max())
    plt.ylim(xx2.min(), xx2.max())

colors = ['red', 'blue', 'green']
X, y = make_blobs(n_features=2, centers=3)

for idx, cl in enumerate (np.unique(y)):
    plt.scatter (x=X[y == cl, 0], y= X[y == cl, 1], alpha=0.8, c=colors[idx], label = cl, edgecolor = 'black')

classifier = MLPClassifier()
classifier.fit(X, y)

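plot_decision_regions(X, y, classifier, resolution = 0.02)
plt.legend()  # show which color belongs to which class label
plt.show()    # needed to display the figure when run as a script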

You get: [image: the resulting decision-region plot]

Upvotes: 2
