Reputation: 481
How can I use a kernel in a logistic regression model using the sklearn library?
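from sklearn import metrics
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, confusion_matrix

# X_train, X_test, y_train, y_test and `predict` (new samples) are assumed to be defined already
logreg = LogisticRegression()
logreg.fit(X_train, y_train)
y_pred = logreg.predict(X_test)
print(y_pred)
print(confusion_matrix(y_test, y_pred))
print(classification_report(y_test, y_pred))
predicted = logreg.predict(predict)
print("Accuracy:", metrics.accuracy_score(y_test, y_pred))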
Upvotes: 11
Views: 6799
Reputation: 33147
Very nice question, but scikit-learn currently supports neither kernel logistic regression nor the ANOVA kernel.
You can implement both yourself, though.
Example 1 for the ANOVA kernel:
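import numpy as np
from sklearn.metrics.pairwise import check_pairwise_arrays
from scipy.linalg import cholesky
from sklearn.linear_model import LogisticRegression

def anova_kernel(X, Y=None, gamma=None, p=1):
    # Validate the inputs; if Y is None, the kernel is computed between X and itself
    X, Y = check_pairwise_arrays(X, Y)
    if gamma is None:
        gamma = 1. / X.shape[1]
    # Per-feature differences, shape (n_samples_X, n_samples_Y, n_features)
    diff = X[:, None, :] - Y[None, :, :]
    diff **= 2
    diff *= -gamma
    np.exp(diff, out=diff)
    # Sum the per-feature Gaussian terms, then raise the sum to the power p
    K = diff.sum(axis=2)
    K **= p
    return K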
# Kernel matrix based on the matrix X of all data points
K = anova_kernel(X)
# cholesky requires K to be positive definite.
# lower=True returns L with K = L @ L.T, so row i of L is the feature vector of sample i
L = cholesky(K, lower=True)
# Define the model
clf = LogisticRegression()
# `train` and `test` are assumed to be index arrays for the training and test sets
clf.fit(L[train], y_train)
preds = clf.predict(L[test])
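To make Example 1 easier to try out, here is a minimal end-to-end sketch on synthetic data. It uses the kernel K(x, y) = (sum_k exp(-gamma * (x_k - y_k)^2))^p from above, written with plain NumPy. The synthetic dataset, the index split, the `1e-6` diagonal jitter (added so the Cholesky factorization does not fail on a nearly singular kernel matrix) and `max_iter=1000` are all assumptions made for illustration, not part of the original recipe.

```python
import numpy as np
from scipy.linalg import cholesky
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split


def anova_kernel(X, Y=None, gamma=None, p=1):
    """ANOVA kernel: (sum_k exp(-gamma * (x_k - y_k)**2)) ** p."""
    X = np.asarray(X, dtype=float)
    Y = X if Y is None else np.asarray(Y, dtype=float)
    if gamma is None:
        gamma = 1.0 / X.shape[1]
    diff = X[:, None, :] - Y[None, :, :]
    return np.exp(-gamma * diff ** 2).sum(axis=2) ** p


# Synthetic data, used only for illustration
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
# Split sample indices rather than rows, because the kernel is built on all points
train, test = train_test_split(np.arange(len(X)), test_size=0.25, random_state=0)

K = anova_kernel(X)
# Assumption: a small diagonal jitter keeps K numerically positive definite
K += 1e-6 * np.eye(len(K))

# Lower factor L satisfies K = L @ L.T, so L[i] @ L[j] reproduces K[i, j]
L = cholesky(K, lower=True)

clf = LogisticRegression(max_iter=1000)
clf.fit(L[train], y[train])
accuracy = clf.score(L[test], y[test])
print("Accuracy:", accuracy)
```

Note that the kernel matrix is built from all samples (train and test features together), so this is a transductive setup: the labels of the test points are never used, but their features are.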
Example 2 for Nyström:
from sklearn.kernel_approximation import Nystroem
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

K_train = anova_kernel(X_train)
clf = Pipeline([
    ('nys', Nystroem(kernel='precomputed', n_components=100)),
    ('lr', LogisticRegression())])
clf.fit(K_train, y_train)
K_test = anova_kernel(X_test, X_train)
preds = clf.predict(K_test)
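In case `kernel='precomputed'` is not accepted by `Nystroem` in your scikit-learn version, a variant that should work is to let `Nystroem` compute one of its built-in kernels directly from the raw features. The sketch below swaps the ANOVA kernel for the built-in RBF kernel, so it is not the same model as above. The synthetic data, `gamma=0.2`, `random_state=0` and `max_iter=1000` are assumptions chosen only for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.kernel_approximation import Nystroem
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline

# Synthetic data, used only for illustration
X, y = make_classification(n_samples=300, n_features=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

clf = Pipeline([
    # Approximate RBF feature map built from 100 sampled training points
    ("nys", Nystroem(kernel="rbf", gamma=0.2, n_components=100, random_state=0)),
    # Linear logistic regression on the approximate kernel features
    ("lr", LogisticRegression(max_iter=1000)),
])
clf.fit(X_train, y_train)
preds = clf.predict(X_test)
accuracy = clf.score(X_test, y_test)
print("Accuracy:", accuracy)
```

Because the pipeline takes raw features, the test set goes through `predict` unchanged and no separate test kernel matrix has to be built.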
Upvotes: 13