Reputation: 93
I have this code in Python:
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import ListedColormap

def plot_decision_regions(X, y, classifier, resolution=0.02):
    markers = ('s', 'x', 'o', '^', 'v')
    colors = ('red', 'blue', 'lightgreen', 'gray', 'cyan')
    cmap = ListedColormap(colors[:len(np.unique(y))])
    # build a grid covering the feature ranges, padded by 1 on each side
    x1_min, x1_max = X[:, 0].min() - 1, X[:, 0].max() + 1
    x2_min, x2_max = X[:, 1].min() - 1, X[:, 1].max() + 1
    xx1, xx2 = np.meshgrid(np.arange(x1_min, x1_max, resolution),
                           np.arange(x2_min, x2_max, resolution))
    # predict a label for every grid point, then shape the result back into the grid
    Z = classifier.predict(np.array([xx1.ravel(), xx2.ravel()]).T)
    Z = Z.reshape(xx1.shape)
    plt.contourf(xx1, xx2, Z, alpha=0.3, cmap=cmap)
    plt.xlim(xx1.min(), xx1.max())
    plt.ylim(xx2.min(), xx2.max())
    # plot the samples of each class separately so each gets its own color/marker
    for idx, cl in enumerate(np.unique(y)):
        plt.scatter(x=X[y == cl, 0], y=X[y == cl, 1], alpha=0.8,
                    c=colors[idx], marker=markers[idx], label=cl,
                    edgecolor='black')
Here X is a 100x2 array of numerical data (sepal and petal length for two kinds of flowers), y is a 100x1 vector containing only -1 and 1 values (the class label vector), and classifier is a Perceptron. I don't know why I need to calculate the transpose in
Z = classifier.predict(np.array([xx1.ravel(), xx2.ravel()]).T)
What do classifier.predict and x=X[y == cl, 0], y=X[y == cl, 1] in

plt.scatter(x=X[y == cl, 0], y=X[y == cl, 1], alpha=0.8, c=colors[idx], marker=markers[idx], label=cl, edgecolor='black')

do?
I previously loaded a dataframe, defined my predict method, and defined X and y.

def predict(self, X):
    '''Return class label after unit step'''
    return np.where(self.net_input(X) >= 0.0, 1, -1)
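For illustration (with made-up net-input values, not from my real data), np.where applies the unit step elementwise:

import numpy as np

net_input = np.array([0.7, -0.3, 0.0])    # hypothetical net-input values
print(np.where(net_input >= 0.0, 1, -1))  # [ 1 -1  1]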
And my Perceptron class contains the weights w_ that are adjusted while iterating. Sorry if my English is not perfect.
y = df.iloc[0:100, 4].values             # first 100 labels (setosa and versicolor)
y = np.where(y == 'Iris-setosa', -1, 1)  # map the string labels to -1 / 1
X = df.iloc[0:100, [0, 2]].values        # sepal length and petal length columns
Upvotes: 1
Views: 1592
Reputation: 4033
Let's break this down. First:

np.array([xx1.ravel(), xx2.ravel()])

.ravel() flattens the xx1 and xx2 arrays. xx1 and xx2 are just coordinates (for feature 1 and feature 2, respectively) arranged in a grid pattern. The idea is that xx1 and xx2 hold a coordinate at every resolution interval across the range of the feature set. With enough of these coordinates, you can effectively determine which regions your classifier assigns to which label.
np.array([xx1.ravel(), xx2.ravel()]).T

The reason you need the transpose is that the .predict() method expects an input array of shape [n_samples, n_features]. Stacking the two ravelled arrays gives shape [n_features, n_samples], which is why we need to transpose.
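Continuing the toy grid from above (rebuilt here so the snippet runs on its own), the shapes before and after .T look like this:

import numpy as np

xx1, xx2 = np.meshgrid(np.arange(0, 2), np.arange(0, 3))  # same toy grid
grid = np.array([xx1.ravel(), xx2.ravel()])
print(grid.shape)    # (2, 6): [n_features, n_samples]
print(grid.T.shape)  # (6, 2): [n_samples, n_features], what predict() expects
print(grid.T[:3])    # first three grid points as (feature1, feature2) rows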
classifier.predict(np.array([xx1.ravel(), xx2.ravel()]).T)

This makes a prediction for every meshgrid point; those predictions are then used as a mask over the plot to show which regions the classifier assigns to which label.
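As a self-contained sketch of that step, here is the same toy grid with a stand-in decision rule (a simple threshold, not your fitted Perceptron):

import numpy as np

xx1, xx2 = np.meshgrid(np.arange(0, 2), np.arange(0, 3))  # toy grid again
# stand-in "classifier": predict 1 above the line x2 = x1, else -1
Z = np.where(xx2.ravel() > xx1.ravel(), 1, -1)
Z = Z.reshape(xx1.shape)  # back to the grid's shape, ready for plt.contourf
print(Z)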
plt.scatter(x=X[y == cl, 0], y=X[y == cl, 1], alpha=0.8, c=colors[idx], marker=markers[idx], label=cl, edgecolor='black')

Here we plot our samples. We want to plot each class of samples separately (so that each gets its own color), so x=X[y == cl, 0] and y=X[y == cl, 1] select only the points whose label equals the one we are currently inspecting (i.e. cl). cl simply iterates over all the unique labels.
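A quick standalone illustration of that boolean indexing, with made-up sample values:

import numpy as np

X = np.array([[5.1, 1.4], [4.9, 1.5], [7.0, 4.7], [6.4, 4.5]])
y = np.array([-1, -1, 1, 1])
cl = 1
print(X[y == cl, 0])  # [7.  6.4] -- feature 1 of the class-1 samples
print(X[y == cl, 1])  # [4.7 4.5] -- feature 2 of the class-1 samples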
It's easier to understand once you see what the result looks like (here's an example using a make_blobs dataset and an MLPClassifier):
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import ListedColormap
from sklearn.datasets import make_blobs
from sklearn.neural_network import MLPClassifier

def plot_decision_regions(X, y, classifier, resolution=0.02):
    markers = ('s', 'x', 'o', '^', 'v')
    colors = ('red', 'blue', 'lightgreen', 'gray', 'cyan')
    cmap = ListedColormap(colors[:len(np.unique(y))])
    # grid of points covering the feature ranges, padded by 1 on each side
    x1_min, x1_max = X[:, 0].min() - 1, X[:, 0].max() + 1
    x2_min, x2_max = X[:, 1].min() - 1, X[:, 1].max() + 1
    xx1, xx2 = np.meshgrid(np.arange(x1_min, x1_max, resolution),
                           np.arange(x2_min, x2_max, resolution))
    # predict a label for every grid point, then shape back into the grid
    Z = classifier.predict(np.array([xx1.ravel(), xx2.ravel()]).T)
    Z = Z.reshape(xx1.shape)
    plt.contourf(xx1, xx2, Z, alpha=0.3, cmap=cmap)
    plt.xlim(xx1.min(), xx1.max())
    plt.ylim(xx2.min(), xx2.max())

colors = ['red', 'blue', 'green']
X, y = make_blobs(n_features=2, centers=3)
# plot each class separately so each gets its own color
for idx, cl in enumerate(np.unique(y)):
    plt.scatter(x=X[y == cl, 0], y=X[y == cl, 1], alpha=0.8,
                c=colors[idx], label=cl, edgecolor='black')
classifier = MLPClassifier()
classifier.fit(X, y)
plot_decision_regions(X, y, classifier, resolution=0.02)
plt.show()  # display the combined scatter + decision-region plot
Upvotes: 2