Chris

Reputation: 31206

pyclustering: intended method of initializing kmeans

Wikipedia describes how to initialize the kmeans cluster locations using a random method.

In pyclustering, a Python clustering library, the various clustering algorithms are implemented with a high-performance C core. This core is faster than numpy/sklearn, so I want to avoid implementing anything in sklearn/numpy (or else I might lose the speed the code has right now).

However, the kmeans class requires an initial cluster location list to get going. What is the intended method of initializing these cluster locations in pyclustering?

Upvotes: 1

Views: 1374

Answers (1)

annoviko

Reputation: 136

The automatically generated pyclustering documentation describes the API of the kmeans algorithm.

For example, suppose you have 2D data from which two clusters should be extracted. You then need to specify the initial centers (pyclustering doesn't generate initial centers; they must be provided by the user):

from pyclustering.cluster.kmeans import kmeans
kmeans_instance = kmeans(sample, [[0.0, 0.1], [2.5, 2.6]], ccore=True)

Here [0.0, 0.1] is the initial center of the first cluster and [2.5, 2.6] is the initial center of the second. Setting the ccore flag to True tells kmeans to use the CCORE library.
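
Because the centers have to come from the caller, one simple option is the random (Forgy-style) initialization: use k distinct observations drawn from the data as the starting centers. Below is a minimal pure-Python sketch, assuming the data is a list of points. `forgy_initial_centers` is a hypothetical helper, not part of pyclustering, and the small sample is made up for illustration.

```python
import random


def forgy_initial_centers(sample, k, seed=None):
    """Pick k distinct points of `sample` as initial centers.

    Hypothetical helper for illustration; not a pyclustering function.
    """
    if not 0 < k <= len(sample):
        raise ValueError("k must be between 1 and len(sample)")
    rng = random.Random(seed)  # a fixed seed makes the choice repeatable
    # Copy each chosen point so the centers are independent of the data list.
    return [list(point) for point in rng.sample(sample, k)]


# Made-up 2D data with two visible groups.
sample = [[0.1, 0.2], [0.0, 0.1], [0.2, 0.0],
          [2.4, 2.5], [2.5, 2.6], [2.6, 2.4]]

initial_centers = forgy_initial_centers(sample, 2, seed=0)
```

The resulting `initial_centers` list has the same shape as the hand-written centers above, so it could be passed as the second argument of kmeans. It uses only the standard library, so it adds no numpy or sklearn dependency.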

Run processing:

kmeans_instance.process()

Obtain the clustering result:

clusters = kmeans_instance.get_clusters()  # list of clusters
centers = kmeans_instance.get_centers()  # list of cluster centers

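Visualize the obtained result:

from pyclustering.cluster import cluster_visualizer
visualizer = cluster_visualizer()
visualizer.append_clusters(clusters, sample)
# initial centers, the same ones that were passed to kmeans
visualizer.append_cluster([[0.0, 0.1], [2.5, 2.6]], marker='*', markersize=20)
# final centers found by the algorithm
visualizer.append_cluster(centers, marker='*', markersize=20)
visualizer.show()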

Click here to see example of result visualization

Usage examples can be found in 'pyclustering/cluster/examples/kmeans_examples.py':

$ ls pyclustering/cluster/examples/ -1
__init__.py
agglomerative_examples.py
birch_examples.py
clarans_examples.py
cure_examples.py
dbscan_examples.py
dbscan_segmentation.py
general_examples.py
hsyncnet_examples.py
kmeans_examples.py  <--- kmeans examples
kmeans_segmentation.py
kmedians_examples.py
kmedoids_examples.py
optics_examples.py
rock_examples.py
somsc_examples.py
syncnet_examples.py
syncsom_examples.py
syncsom_segmentation.py
xmeans_examples.py

Upvotes: 2
