Reputation: 93
Before asking my questions, some background. The code below runs without errors. In theory it should produce figure 2, but it produces figure 1 instead. I looked online for a similar implementation and found one, but it has exactly the same problem. I have also noticed that the libraries keep being updated and improved. I compared the number of misclassified samples for LogisticRegression on this flower dataset: with an earlier version (I don't remember which one) there were 7 or 8, and with the most recent version there is 1.
Question 1: Do I need to change the code to obtain figure 2? If not, how should figure 1 be interpreted? If so, what should I modify, and why?
Question 2: How can I see what changed between library versions, specifically for the scikit-learn algorithms?
import numpy as np
import matplotlib.pyplot as plt
from sklearn.preprocessing import StandardScaler
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
iris = datasets.load_iris()
X = iris.data[:, [2, 3]]
y = iris.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)
sc = StandardScaler()
sc.fit(X_train)
X_train_std = sc.transform(X_train)
X_test_std = sc.transform(X_test)
lr = LogisticRegression(C=100.0, random_state=1)
lr.fit(X_train_std, y_train)
weights, params = [], []
for c in np.arange(-5, 5):
    lr = LogisticRegression(C=10.**c, random_state=1)
    lr.fit(X_train_std, y_train)
    weights.append(lr.coef_[1])  # weights for class 1
    params.append(10.**c)
weights = np.array(weights)
plt.plot(params, weights[:, 0], color='blue', marker='x', label='petal length')
plt.plot(params, weights[:, 1], color='green', marker='o', label='petal width')
plt.ylabel('weight coefficient')
plt.xlabel('C')
plt.legend(loc='right')
plt.xscale('log')
plt.show()
Upvotes: 1
Views: 58
Reputation: 1383
This worked for me:
lr = LogisticRegression(C=10.**c, random_state=1, multi_class='ovr')
multi_class was one of the parameters whose default value changed in a recent scikit-learn release (in 0.22 it went from 'ovr' to 'auto'). The other parameter was solver (from 'liblinear' to 'lbfgs'). For some reason, turning it into a one-vs-rest problem, that is, one binary problem per class, solves the issue, but I don't have much of a clue about what's going on, to be honest. :)
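To make the difference concrete, here is a minimal sketch that computes the class 1 weights both ways over the same range of C. It assumes that wrapping LogisticRegression in OneVsRestClassifier is an acceptable stand-in for multi_class='ovr' (newer scikit-learn releases deprecate that parameter); the helper name class1_weights is made up for illustration. With the library defaults in recent releases the model is multinomial, so each class's weights are fitted jointly with the other classes, whereas one-vs-rest fits a separate binary model for class 1 against the rest. That is probably why the two curves differ, though I have not verified it against both figures. The first line also prints the installed version, which touches on Question 2; the per-release changes themselves are listed in the release notes in the scikit-learn documentation.

```python
import numpy as np
import sklearn
from sklearn import datasets
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsRestClassifier
from sklearn.preprocessing import StandardScaler

# Which scikit-learn is installed (defaults differ between releases).
print("scikit-learn version:", sklearn.__version__)

# Same data preparation as in the question.
iris = datasets.load_iris()
X = iris.data[:, [2, 3]]
y = iris.target
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)
sc = StandardScaler().fit(X_train)
X_train_std = sc.transform(X_train)


def class1_weights(C, one_vs_rest):
    """Hypothetical helper: return the two weights of class 1 for a given C."""
    base = LogisticRegression(C=C, random_state=1)
    if one_vs_rest:
        # One binary classifier per class; estimators_[1] is "class 1 vs rest".
        model = OneVsRestClassifier(base).fit(X_train_std, y_train)
        return model.estimators_[1].coef_[0]
    # Library default (multinomial in recent releases).
    return base.fit(X_train_std, y_train).coef_[1]


params = 10.0 ** np.arange(-5, 5)
ovr_weights = np.array([class1_weights(C, True) for C in params])
default_weights = np.array([class1_weights(C, False) for C in params])

for C, w_ovr, w_def in zip(params, ovr_weights, default_weights):
    print(f"C={C:g}  one-vs-rest={w_ovr}  default={w_def}")
```

Either array can be passed to the plotting calls from the question (weights[:, 0] for petal length, weights[:, 1] for petal width) to draw the two figures side by side.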
Upvotes: 1