Reputation: 1440
According to the sklearn_extra documentation on KMedoids, KMedoids should have the following parameters: n_clusters, metric, method, init, max_iter and random_state. The method parameter determines which algorithm to use: alternate or pam. According to sklearn_extra's user guide, these methods are inherently different from each other. For my specific application I want to use the PAM version of k-medoids. However, the method parameter seems to have disappeared. When I inspect KMedoids:
import inspect
from sklearn_extra.cluster import KMedoids
inspect.getargspec(KMedoids)
I get the following output:
ArgSpec(args=['self', 'n_clusters', 'metric', 'init', 'max_iter', 'random_state'], varargs=None,
keywords=None, defaults=(8, 'euclidean', 'heuristic', 300, None))
Here, the method parameter is also missing. In the code of KMedoids it appears to be still there. Does anyone know where the parameter has gone? I cannot find anything about it on the internet.
Upvotes: 1
Views: 2174
Reputation: 60390
There seems to be a discrepancy between the latest GitHub version (and the corresponding documentation) and the latest version available on PyPI (currently dated 29 March 2020).
If we install from PyPI with pip,
pip install scikit-learn-extra
and then inspect with
print(inspect.getsource(KMedoids))
which shows the source code of the package as installed on our machine, we get
class KMedoids(BaseEstimator, ClusterMixin, TransformerMixin):
    """k-medoids clustering.
    Read more in the :ref:`User Guide <k_medoids>`.
    Parameters
    ----------
    n_clusters : int, optional, default: 8
        The number of clusters to form as well as the number of medoids to
        generate.
    metric : string, or callable, optional, default: 'euclidean'
        What distance metric to use. See :func:metrics.pairwise_distances
    init : {'random', 'heuristic', 'k-medoids++'}, optional, default: 'heuristic'
        Specify medoid initialization method. 'random' selects n_clusters
        elements from the dataset. 'heuristic' picks the n_clusters points
        with the smallest sum distance to every other point. 'k-medoids++'
        follows an approach based on k-means++_, and in general, gives initial
        medoids which are more separated than those generated by the other methods.
        .. _k-means++: https://theory.stanford.edu/~sergei/papers/kMeansPP-soda.pdf
    max_iter : int, optional, default : 300
        Specify the maximum number of iterations when fitting.
    random_state : int, RandomState instance or None, optional
        Specify random state for the random number generator. Used to
        initialise medoids when init='random'.
and
    def __init__(
        self,
        n_clusters=8,
        metric="euclidean",
        init="heuristic",
        max_iter=300,
        random_state=None,
    ):
        self.n_clusters = n_clusters
        self.metric = metric
        self.init = init
        self.max_iter = max_iter
        self.random_state = random_state
i.e. the method argument is indeed nowhere to be seen.
Installing from GitHub:
pip install git+https://github.com/scikit-learn-contrib/scikit-learn-extra.git
and
inspect.getfullargspec(KMedoids)
indeed gives
FullArgSpec(args=['self', 'n_clusters', 'metric', 'method', 'init', 'max_iter', 'random_state'], varargs=None, varkw=None, defaults=(8, 'euclidean', 'alternate', 'heuristic', 300, None), kwonlyargs=[], kwonlydefaults=None, annotations={})
It is interesting that, in both cases (PyPI and GitHub), the reported version is exactly the same (0.1.0b2); something seems to have gone wrong here in terms of good software development practices...
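Since the version string cannot tell the two builds apart, one way to find out which one is installed is to look at the constructor signature directly. The snippet below is only a rough sketch of that check: the helper names accepts_parameter and installed_version, and the two stand-in classes, are made up for illustration and are not part of sklearn_extra. It uses inspect.signature, which also works on recent Python versions where inspect.getargspec has been removed.

```python
import inspect
from importlib import metadata


def accepts_parameter(cls, name):
    """Return True if the constructor of cls accepts a parameter called name."""
    return name in inspect.signature(cls).parameters


def installed_version(distribution):
    """Return the installed version string, or None if it is not installed."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


# Hypothetical stand-ins that copy the two constructor signatures shown above.
class OldKMedoids:
    def __init__(self, n_clusters=8, metric="euclidean", init="heuristic",
                 max_iter=300, random_state=None):
        pass


class NewKMedoids:
    def __init__(self, n_clusters=8, metric="euclidean", method="alternate",
                 init="heuristic", max_iter=300, random_state=None):
        pass


print(accepts_parameter(OldKMedoids, "method"))  # False
print(accepts_parameter(NewKMedoids, "method"))  # True
# Prints the installed version, or None if the package is absent.
print(installed_version("scikit-learn-extra"))
```

With the real package, the same check would be accepts_parameter(KMedoids, "method") after from sklearn_extra.cluster import KMedoids.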
Upvotes: 1
Reputation: 16916
The method parameter is available in the latest development version. So uninstall the existing version you have and install the latest directly from GitHub using:
pip install https://github.com/scikit-learn-contrib/scikit-learn-extra/archive/master.zip
!pip install https://github.com/scikit-learn-contrib/scikit-learn-extra/archive/master.zip
from sklearn_extra.cluster import KMedoids
import numpy as np
X = np.asarray([[1, 2], [1, 4], [1, 0],
                [4, 2], [4, 4], [4, 0]])
kmedoids = KMedoids(n_clusters=2, random_state=0, method="pam").fit(X)
kmedoids.labels_
Output:
array([0, 0, 0, 1, 1, 1])
Upvotes: 1