Norhther

Reputation: 500

GridSearchCV with custom Kernel

I have the following code:

import numpy as np
from sklearn import svm
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

def tanimotoKernel(xs, ys):
    a = 0
    b = 0
    for x, y in zip(xs, ys):
        a += min(x, y)
        b += max(x, y)
    return a / b

def tanimotoLambdaKernel(xs,ys, gamma = 0.01):
    return np.exp(gamma * tanimotoKernel(xs,ys)) / (np.exp(gamma) - 1)

def GramMatrix(X1, X2, K_function=tanimotoLambdaKernel):
    gram_matrix = np.zeros((X1.shape[0], X2.shape[0]))
    for i, x1 in enumerate(X1):
        for j, x2 in enumerate(X2):
            gram_matrix[i, j] = K_function(x1, x2)
    return gram_matrix

X, y = datasets.load_iris(return_X_y=True)
x_train, x_test, y_train, y_test = train_test_split(X, y)
clf = svm.SVC(kernel=GramMatrix)
clf.fit(x_train, y_train)
accuracy_score(y_test, clf.predict(x_test))

However, I would like to tune the gamma parameter of tanimotoLambdaKernel with GridSearchCV, because I don't want to try each value by hand and check the accuracy myself.

Is there any way to do this?

Upvotes: 0

Views: 268

Answers (1)

Ben Reiniger

Reputation: 12602

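This doesn't seem to be possible directly: SVC applies its own gamma parameter only to the built-in kernels, and a callable kernel is called with just the two data matrices, so any parameter of your kernel is baked into the function you pass.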

One approach is to treat kernel itself as the hyperparameter and search over a list of callables, one per gamma value. This is a little involved because your kernel is built from nested functions, so I use functools.partial:

from functools import partial
param_space = {
    "kernel": [
        partial(
            GramMatrix,
            K_function=partial(
                tanimotoLambdaKernel,
                gamma=g,
            )
        )
        for g in <your list of gammas>
    ]
}
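
As a rough sketch of how that grid could be used end to end: the gamma values, cv=3 and random_state=0 below are placeholder choices of mine, and the kernel functions are compact restatements of the ones in the question so that the snippet runs on its own.

```python
from functools import partial

import numpy as np
from sklearn import datasets, svm
from sklearn.model_selection import GridSearchCV, train_test_split


# Compact restatements of the question's functions (same min/max ratio).
def tanimotoKernel(xs, ys):
    return np.minimum(xs, ys).sum() / np.maximum(xs, ys).sum()


def tanimotoLambdaKernel(xs, ys, gamma=0.01):
    return np.exp(gamma * tanimotoKernel(xs, ys)) / (np.exp(gamma) - 1)


def GramMatrix(X1, X2, K_function=tanimotoLambdaKernel):
    return np.array([[K_function(x1, x2) for x2 in X2] for x1 in X1])


gammas = [0.01, 0.1, 1.0]  # assumed example values, not recommendations

# One callable kernel per gamma; "kernel" is the parameter being searched.
param_space = {
    "kernel": [
        partial(GramMatrix, K_function=partial(tanimotoLambdaKernel, gamma=g))
        for g in gammas
    ]
}

X, y = datasets.load_iris(return_X_y=True)
x_train, x_test, y_train, y_test = train_test_split(X, y, random_state=0)

search = GridSearchCV(svm.SVC(), param_space, cv=3)
search.fit(x_train, y_train)

# The winning "hyperparameter" is a partial object, so gamma is dug out of it.
best_kernel = search.best_params_["kernel"]
best_gamma = best_kernel.keywords["K_function"].keywords["gamma"]
print(best_gamma, search.best_score_)
print(search.score(x_test, y_test))
```

The awkward part is visible at the end: best_params_ holds an opaque callable, so recovering the gamma that won means unwrapping the nested partials.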

The other approach that comes to mind is a custom class. This is cleaner in the hyperparameter search because "the hyperparameter" can be just gamma, but potentially more work in maintaining the class. In this case I avoid overriding __init__ by reusing SVC's existing gamma parameter, and I set the kernel at fit time so that set_params works properly for gamma. Note that gamma then has to be set to a number before fitting, since SVC's default is the string 'scale'.

class SVC_tanimoto(svm.SVC):
    """SVC with a Tanimoto kernel."""
    def fit(self, X, y, sample_weight=None):
        self.kernel = partial(
            GramMatrix,
            K_function=partial(
                tanimotoLambdaKernel,
                gamma=self.gamma,
            )
        )
        super().fit(X, y, sample_weight=sample_weight)
        return self
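
With that class, the search could look roughly like the following. This is only a sketch: the gamma grid and cv=3 are assumed example values, and the kernel functions are again compact restatements of the question's so the snippet is self-contained.

```python
from functools import partial

import numpy as np
from sklearn import datasets, svm
from sklearn.model_selection import GridSearchCV


# Compact restatements of the question's functions (same min/max ratio).
def tanimotoKernel(xs, ys):
    return np.minimum(xs, ys).sum() / np.maximum(xs, ys).sum()


def tanimotoLambdaKernel(xs, ys, gamma=0.01):
    return np.exp(gamma * tanimotoKernel(xs, ys)) / (np.exp(gamma) - 1)


def GramMatrix(X1, X2, K_function=tanimotoLambdaKernel):
    return np.array([[K_function(x1, x2) for x2 in X2] for x1 in X1])


class SVC_tanimoto(svm.SVC):
    """SVC with a Tanimoto kernel; gamma must be a number, not 'scale'/'auto'."""

    def fit(self, X, y, sample_weight=None):
        # Rebuild the kernel from the current gamma on every fit.
        self.kernel = partial(
            GramMatrix,
            K_function=partial(tanimotoLambdaKernel, gamma=self.gamma),
        )
        super().fit(X, y, sample_weight=sample_weight)
        return self


X, y = datasets.load_iris(return_X_y=True)

# gamma is now an ordinary numeric hyperparameter (assumed example values).
param_grid = {"gamma": [0.01, 0.1, 1.0]}

search = GridSearchCV(SVC_tanimoto(), param_grid, cv=3)
search.fit(X, y)
print(search.best_params_, search.best_score_)
```

Here best_params_ reports gamma as a plain number, which is the main thing the extra class buys you over searching across kernel callables.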

Upvotes: 2
