Reputation: 1440
According to the sklearn_extra documentation on KMedoids, KMedoids should have the following parameters: n_clusters, metric, method, init, max_iter and random_state. The method parameter determines which algorithm to use: alternate or pam. According to sklearn_extra's user guide, these methods are inherently different from each other. For my specific application I want to use the PAM version of k-medoids. However, the method parameter seems to have disappeared. When I run an inspect on the KMedoids function:
import inspect
from sklearn_extra.cluster import KMedoids
I get the following output:
ArgSpec(args=['self', 'n_clusters', 'metric', 'init', 'max_iter', 'random_state'], varargs=None,
keywords=None, defaults=(8, 'euclidean', 'heuristic', 300, None))
Here, the method parameter is also missing. In the code of KMedoids it still appears to be there. Does anyone know where the parameter has gone? I cannot find anything about it on the internet.
Upvotes: 1
Views: 2174
Reputation: 60390
There seems to be a discrepancy between the latest GitHub version (and the corresponding documentation) and the latest version available on PyPI (currently dated 29 March 2020).
If we install from PyPI with pip
pip install scikit-learn-extra
and then inspect with
which actually gives the source code of the package in our machine, we get
class KMedoids(BaseEstimator, ClusterMixin, TransformerMixin):
"""k-medoids clustering.
Read more in the :ref:`User Guide <k_medoids>`.
n_clusters : int, optional, default: 8
The number of clusters to form as well as the number of medoids to
metric : string, or callable, optional, default: 'euclidean'
What distance metric to use. See :func:metrics.pairwise_distances
init : {'random', 'heuristic', 'k-medoids++'}, optional, default: 'heuristic'
Specify medoid initialization method. 'random' selects n_clusters
elements from the dataset. 'heuristic' picks the n_clusters points
with the smallest sum distance to every other point. 'k-medoids++'
follows an approach based on k-means++_, and in general, gives initial
medoids which are more separated than those generated by the other methods.
.. _k-means++:
max_iter : int, optional, default : 300
Specify the maximum number of iterations when fitting.
random_state : int, RandomState instance or None, optional
Specify random state for the random number generator. Used to
initialise medoids when init='random'.
def __init__(self, n_clusters=8, metric="euclidean", init="heuristic", max_iter=300, random_state=None):
    self.n_clusters = n_clusters
    self.metric = metric
    self.init = init
    self.max_iter = max_iter
    self.random_state = random_state
i.e. the method argument is indeed nowhere to be seen.
Installing from GitHub:
pip install git+
gives indeed
FullArgSpec(args=['self', 'n_clusters', 'metric', 'method', 'init', 'max_iter', 'random_state'], varargs=None, varkw=None, defaults=(8, 'euclidean', 'alternate', 'heuristic', 300, None), kwonlyargs=[], kwonlydefaults=None, annotations={})
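A quick way to compare the two builds without reading the source is to list the constructor parameters with inspect.signature. The following is only a sketch: constructor_parameters is a helper name made up for this illustration, and PyPIKMedoids and GitHubKMedoids are hypothetical stand-in classes that merely copy the two signatures reported above, so the snippet runs even without sklearn_extra installed.

```python
import inspect


def constructor_parameters(cls):
    """Return the parameter names accepted by cls.__init__, excluding 'self'."""
    signature = inspect.signature(cls.__init__)
    return [name for name in signature.parameters if name != "self"]


# Hypothetical stand-in mirroring the signature reported for the PyPI build.
class PyPIKMedoids:
    def __init__(self, n_clusters=8, metric="euclidean", init="heuristic",
                 max_iter=300, random_state=None):
        pass


# Hypothetical stand-in mirroring the signature reported for the GitHub build.
class GitHubKMedoids:
    def __init__(self, n_clusters=8, metric="euclidean", method="alternate",
                 init="heuristic", max_iter=300, random_state=None):
        pass


print("method" in constructor_parameters(PyPIKMedoids))    # False
print("method" in constructor_parameters(GitHubKMedoids))  # True
```

With the real package, the same helper would be called as constructor_parameters(KMedoids) after from sklearn_extra.cluster import KMedoids.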
It is interesting that, in both cases (PyPI and GitHub), the reported version is exactly the same (0.1.0b2); something seems to have gone wrong here in terms of good software development practices...
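For reference, the installed version can be read without importing the package, using importlib.metadata from the standard library (Python 3.8+). This is a minimal sketch; installed_version is a helper name chosen for this example. As noted above, both builds report the same version string, so this check alone cannot tell them apart.

```python
from importlib import metadata


def installed_version(distribution_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(distribution_name)
    except metadata.PackageNotFoundError:
        return None


# Prints the version string if the package is installed, otherwise None.
print(installed_version("scikit-learn-extra"))
```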
Upvotes: 1
Reputation: 16916
The method parameter is available in the latest development version. So uninstall the version you have and install the latest directly from GitHub using:
pip install
!pip install
from sklearn_extra.cluster import KMedoids
import numpy as np
X = np.asarray([[1, 2], [1, 4], [1, 0],
                [4, 2], [4, 4], [4, 0]])
kmedoids = KMedoids(n_clusters=2, random_state=0, method="pam").fit(X)
kmedoids.labels_
array([0, 0, 0, 1, 1, 1])
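If the code has to run on machines where either build might be installed, one option is to fail early with a clear message rather than with a TypeError about an unexpected keyword. The sketch below assumes the constructor lists its parameters explicitly (no **kwargs); build_with_method is a hypothetical helper, and OldStyle and NewStyle are stand-in classes used only so the example runs without sklearn_extra.

```python
import inspect


def build_with_method(estimator_cls, method, **kwargs):
    """Instantiate estimator_cls with method=..., or raise a descriptive error."""
    parameters = inspect.signature(estimator_cls.__init__).parameters
    if "method" not in parameters:
        raise RuntimeError(
            f"{estimator_cls.__name__} does not accept 'method'; "
            "the installed build probably predates that parameter."
        )
    return estimator_cls(method=method, **kwargs)


# Hypothetical stand-in for a build without the 'method' parameter.
class OldStyle:
    def __init__(self, n_clusters=8):
        self.n_clusters = n_clusters


# Hypothetical stand-in for a build with the 'method' parameter.
class NewStyle:
    def __init__(self, n_clusters=8, method="alternate"):
        self.n_clusters = n_clusters
        self.method = method


model = build_with_method(NewStyle, "pam", n_clusters=2)
print(model.method)  # pam

try:
    build_with_method(OldStyle, "pam", n_clusters=2)
except RuntimeError as error:
    print(error)
```

With the real class this would be build_with_method(KMedoids, "pam", n_clusters=2, random_state=0).fit(X).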
Upvotes: 1