Reputation: 1917
Someone posted a similar question here, but I couldn't get my code working from it; see
Sklearn kNN usage with a user defined metric
I want to define my own metric (user_metric) and use it in KNN.
It seems I have a signature problem, but I don't understand it. Thanks.
gamma=2
def mydist2 (x,y):
    z=(x-y)
    return (z[0]^2+gamma*z[1]^2)

neigh = KNeighborsClassifier(n_neighbors=3,metric=mydist2)
neigh.fit(traindata,train_labels)
neigh.score(testdata,test_labels)
ValueError Traceback (most recent call last)
<ipython-input-81-f934c7b5c9b3> in <module>()
----> 1 neigh.fit(traindata,train_labels)
      2 neigh.score(testdata,test_labels)

C:\Users\Fagui\Anaconda2\lib\site-packages\sklearn\neighbors\base.pyc in fit(self, X, y)
    801 self._y = self._y.ravel()
    802
--> 803 return self._fit(X)
    804
    805

C:\Users\Fagui\Anaconda2\lib\site-packages\sklearn\neighbors\base.pyc in _fit(self, X)
    256 self._tree = BallTree(X, self.leaf_size,
    257 metric=self.effective_metric_,
--> 258 **self.effective_metric_params_)
    259 elif self._fit_method == 'kd_tree':
    260 self._tree = KDTree(X, self.leaf_size,

sklearn/neighbors/binary_tree.pxi in sklearn.neighbors.ball_tree.BinaryTree.__init__ (sklearn\neighbors\ball_tree.c:8381)()

sklearn/neighbors/dist_metrics.pyx in sklearn.neighbors.dist_metrics.DistanceMetric.get_metric (sklearn\neighbors\dist_metrics.c:4032)()

sklearn/neighbors/dist_metrics.pyx in sklearn.neighbors.dist_metrics.PyFuncDistance.__init__ (sklearn\neighbors\dist_metrics.c:10628)()

ValueError: func must be a callable taking two arrays
As a bonus question, I'd like to pass gamma as an argument to the metric.
Thanks very much.
Upvotes: 1
Views: 3721
Reputation: 66
Define the metric in Cython, build the module to create the library, and call it from your main code.
Sklearn is optimized: it uses Cython and several processes to run as fast as possible. A metric written in pure Python will slow your code down, especially because it is called many times. I recommend that you write your custom metric in Cython. There is a tutorial that you can follow right here
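For this particular metric there may be a way to avoid a Python callback without writing Cython. The sketch below is a swapped-in technique, not the approach described above: it assumes two-feature data and the weighted squared distance from the question, and it checks that the custom distance equals the squared Euclidean distance once the second feature is multiplied by sqrt(gamma). The helper scale_features and the sample points are hypothetical, made up for illustration.

```python
import numpy as np

gamma = 2


def mydist2(x, y, gamma=2):
    """Weighted squared distance from the question (with ** for powers)."""
    z = x - y
    return z[0] ** 2 + gamma * z[1] ** 2


def scale_features(data, gamma):
    """Hypothetical helper: multiply the second column by sqrt(gamma)."""
    scaled = np.asarray(data, dtype=float).copy()
    scaled[:, 1] *= np.sqrt(gamma)
    return scaled


# Made-up sample points, two features each.
points = np.array([[0.0, 0.0], [1.0, 2.0], [3.0, -1.0]])
scaled = scale_features(points, gamma)

# Pairwise distances computed with the Python callable.
custom = np.array([[mydist2(a, b, gamma) for b in points] for a in points])

# Pairwise squared Euclidean distances on the rescaled data.
diff = scaled[:, None, :] - scaled[None, :, :]
squared_euclidean = (diff ** 2).sum(axis=2)

print(np.allclose(custom, squared_euclidean))
```

Because squaring preserves the ordering of non-negative distances, the nearest neighbours under the default Euclidean metric on the rescaled data should match those under mydist2, so the rescaled arrays could be passed to KNeighborsClassifier without a custom metric.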
Upvotes: -1
Reputation: 1917
My question was very stupid.
The syntax was correct.
The problem is that exponentiation in Python is written with **, not with ^ (which is the bitwise XOR operator).
Hence 16 = 2**4, not 2^4.
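A minimal sketch of the difference, plus the metric rewritten with **. The helper xor_on_floats_raises is hypothetical and only there to show that ^ is not defined for floats. That would be consistent with the traceback in the question, assuming scikit-learn probes the callable with float arrays and reports the resulting TypeError as the ValueError shown there.

```python
def xor_on_floats_raises():
    """Hypothetical helper: True if applying ^ to a float raises TypeError."""
    value = 1.5
    try:
        value ^ 2
    except TypeError:
        return True
    return False


def mydist2(x, y, gamma=2):
    """Weighted squared distance, written with ** instead of ^."""
    dx = x[0] - y[0]
    dy = x[1] - y[1]
    return dx ** 2 + gamma * dy ** 2


print(2 ** 4)                            # exponentiation: 16
print(2 ^ 4)                             # bitwise XOR of 0b010 and 0b100: 6
print(xor_on_floats_raises())            # True
print(mydist2([0.0, 0.0], [1.0, 2.0]))   # 1 + 2*4 = 9.0
```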
Upvotes: 2
Reputation: 2399
From the KNeighborsClassifier documentation: the metric argument must be a string or a DistanceMetric object, and you gave a function.
To pass your own metric, you have to specify metric='pyfunc' and add the keyword argument func=mydist2.
In the similar question, they explain that a custom metric can only be used when algorithm='ball_tree' is set, and you kept the default, which is 'auto'.
I think that the following should work:
neigh = KNeighborsClassifier(n_neighbors=3, algorithm='ball_tree',metric='pyfunc', func=mydist2)
To pass gamma as an argument, I would try:
def mydist2(x, y, gamma=2):
    z = x - y
    # ** is exponentiation; ^ would be bitwise XOR
    return z[0]**2 + gamma*z[1]**2
and add the argument metric_params={'gamma':2}
neigh = KNeighborsClassifier(n_neighbors=3, algorithm='ball_tree',metric='pyfunc', func=mydist2, metric_params={'gamma':2} )
But I'm not sure; there is no clear example in the docs.
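For what it's worth, here is a hedged sketch of passing gamma through metric_params. It assumes a recent scikit-learn release, where KNeighborsClassifier accepts a callable for metric and forwards metric_params to it as keyword arguments, so it passes the function directly instead of using metric='pyfunc'. The toy data is made up for illustration.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier


def mydist2(x, y, gamma=2):
    """Weighted squared distance on the first two features."""
    z = x - y
    return z[0] ** 2 + gamma * z[1] ** 2


# Made-up toy data: two well-separated clusters.
traindata = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
                      [5.0, 5.0], [5.1, 5.2], [5.2, 5.1]])
train_labels = np.array([0, 0, 0, 1, 1, 1])
testdata = np.array([[0.05, 0.1], [5.05, 5.1]])
test_labels = np.array([0, 1])

neigh = KNeighborsClassifier(
    n_neighbors=3,
    algorithm="brute",           # assumption: see the note below
    metric=mydist2,              # the callable itself
    metric_params={"gamma": 2},  # forwarded as mydist2(x, y, gamma=2)
)
neigh.fit(traindata, train_labels)

predictions = neigh.predict(testdata)
accuracy = neigh.score(testdata, test_labels)
print(predictions, accuracy)
```

The sketch uses algorithm="brute" rather than 'ball_tree' because mydist2 returns a squared distance, which is not guaranteed to satisfy the triangle inequality that tree-based search relies on. Taking the square root inside the function would be one way to make it a true metric if ball_tree is preferred.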
Upvotes: 2