Reputation: 1201
I'm trying to understand how the RidgeClassifier from sklearn.linear_model works in the multi-class case. I found a similar question here, but I am unable to follow it completely. According to what I understand from the answer there:
import numpy as np

X = np.random.random((5, 4))
y = [0, 1, 0, 2, 2]

############## This method #################
from sklearn.preprocessing import LabelBinarizer
from sklearn.linear_model import Ridge

# One-hot encode the labels, then fit a multi-output regression
y_new = LabelBinarizer().fit_transform(y)
r = Ridge()
r.fit(X, y_new)
r.coef_

############# is same as this ##############
from sklearn.linear_model import RidgeClassifier

clf = RidgeClassifier()
clf.fit(X, y)
clf.coef_
However, the coef_ values are not the same. What actually happens in the multi-class case?
Upvotes: 0
Views: 1264
Reputation: 535
Your approach is essentially right, and both snippets do produce the same classification output in the end.
The LabelBinarizer converts your class values (0, 1, 2) into a multi-output array of binary indicators ((1, 0, 0), (0, 1, 0), (0, 0, 1)). Fitting a Ridge regressor on this multi-output target makes the regression behave like a multi-class classifier if you take the highest output, for example.
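For illustration, here is a minimal sketch of that idea, reusing the data from the question (the fixed seed is my addition, just for reproducibility):

import numpy as np
from sklearn.preprocessing import LabelBinarizer
from sklearn.linear_model import Ridge

rng = np.random.RandomState(0)  # assumed seed, for reproducibility
X = rng.random_sample((5, 4))
y = [0, 1, 0, 2, 2]

lb = LabelBinarizer()
y_new = lb.fit_transform(y)  # shape (5, 3): one 0/1 column per class

r = Ridge().fit(X, y_new)
scores = r.predict(X)        # shape (5, 3): one raw score per class

# Classify by picking the class whose column has the highest score
pred = lb.classes_[np.argmax(scores, axis=1)]
print(pred)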
The RidgeClassifier instead encodes the class values as -1 and 1, because regression works better when the targets are symmetric around zero. This is the main difference between the two approaches, with some sugar on top of it to improve performance and convergence.
Check the source of the RidgeClassifier: source. In the fit function, you will see the same LabelBinarizer, but constructed with pos_label=1 and neg_label=-1 so that the targets lie between -1 and 1.
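You can reproduce this yourself with a ±1 binarizer. A minimal sketch, assuming the default alpha=1.0 for both estimators (the seed is again my addition):

import numpy as np
from sklearn.preprocessing import LabelBinarizer
from sklearn.linear_model import Ridge, RidgeClassifier

rng = np.random.RandomState(0)  # assumed seed, for reproducibility
X = rng.random_sample((5, 4))
y = [0, 1, 0, 2, 2]

# Encode the targets as -1/+1, the way RidgeClassifier does internally
y_pm = LabelBinarizer(neg_label=-1, pos_label=1).fit_transform(y)

r = Ridge().fit(X, y_pm)
clf = RidgeClassifier().fit(X, y)

# The coefficients should now agree up to numerical tolerance
print(np.allclose(r.coef_, clf.coef_))  # expected: True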
Your final coefs are different, but that is expected because the raw regression targets are different. However, if you take the class with the maximum output, you will normally get the same classification result.
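To see this concretely, here is a small check under the same assumed setup: the 0/1 fit and the classifier produce different coef_ but rank the classes identically, so the predicted labels match.

import numpy as np
from sklearn.preprocessing import LabelBinarizer
from sklearn.linear_model import Ridge, RidgeClassifier

rng = np.random.RandomState(0)  # assumed seed, for reproducibility
X = rng.random_sample((5, 4))
y = [0, 1, 0, 2, 2]

lb = LabelBinarizer()
r = Ridge().fit(X, lb.fit_transform(y))  # trained on 0/1 targets
clf = RidgeClassifier().fit(X, y)        # trains on -1/+1 targets internally

pred_ridge = lb.classes_[np.argmax(r.predict(X), axis=1)]
pred_clf = clf.predict(X)

# Different coef_, same ranking, hence the same predicted labels
print(np.array_equal(pred_ridge, pred_clf))  # expected: True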
Upvotes: 1