Edison

Reputation: 4291

Change in preference value does not affect the results of Affinity propagation Clustering

Refer to the following code

import numpy as np
from sklearn.cluster import AffinityPropagation
from sklearn import metrics
from sklearn.datasets import make_blobs

##############################################################################
# Generate sample data
centers = [[1, 1], [-1, -1], [1, -1]]
X, labels_true = make_blobs(n_samples=300, centers=centers, cluster_std=0.5)

# Compute similarities
X_norms = np.sum(X ** 2, axis=1)
S = - X_norms[:, np.newaxis] - X_norms[np.newaxis, :] + 2 * np.dot(X, X.T)
p=[10 * np.median(S),np.mean(S,axis=1),np.mean(S,axis=0),100000,-100000]
##############################################################################

# Compute Affinity Propagation
for preference in p:
    af = AffinityPropagation().fit(S, preference)
    cluster_centers_indices = af.cluster_centers_indices_
    labels = af.labels_

    n_clusters_ = len(cluster_centers_indices)

    print('Estimated number of clusters: %d' % n_clusters_)
    print("Homogeneity: %0.3f" % metrics.homogeneity_score(labels_true, labels))
    print("Completeness: %0.3f" % metrics.completeness_score(labels_true, labels))
    print("V-measure: %0.3f" % metrics.v_measure_score(labels_true, labels))
    print("Adjusted Rand Index: %0.3f" % \
          metrics.adjusted_rand_score(labels_true, labels))
    print("Adjusted Mutual Information: %0.3f" % \
          metrics.adjusted_mutual_info_score(labels_true, labels))
    D = (S / np.min(S))
    print("Silhouette Coefficient: %0.3f" %
          metrics.silhouette_score(D, labels, metric='precomputed'))

    ##############################################################################

    # Plot result
    import pylab as pl
    from itertools import cycle

    pl.close('all')
    pl.figure(1)
    pl.clf()

    colors = cycle('bgrcmykbgrcmykbgrcmykbgrcmyk')
    for k, col in zip(range(n_clusters_), colors):
        class_members = labels == k
        cluster_center = X[cluster_centers_indices[k]]
        pl.plot(X[class_members, 0], X[class_members, 1], col + '.')
        pl.plot(cluster_center[0], cluster_center[1], 'o', markerfacecolor=col,
                markeredgecolor='k', markersize=14)
        for x in X[class_members]:
            pl.plot([cluster_center[0], x[0]], [cluster_center[1], x[1]], col)

    pl.title('Estimated number of clusters: %d' % n_clusters_)
    pl.show()

Although I am changing the preference value in the loop, I still get the same clusters. Why does changing the preference value not affect the clustering results?

Update

When I tried the following code, the outcome was as shown below:

[image: correct]

When I passed the preference to the constructor, as recommended by Agost, I got the following output:

[image]

Upvotes: 2

Views: 1469

Answers (3)

h elrefae

Reputation: 1

centers = [[1, 1], [-1, -1], [1, -1]]
X, labels_true = make_blobs(n_samples=300, centers=centers, cluster_std=0.5)

Are the two lines above only for generating a sample dataset, or do I have to keep them in the code when X holds my own features?

Upvotes: 0

Has QUIT--Anony-Mousse

Reputation: 77454

The sklearn implementation of AP appears to be quite fragile.

My suggestions for using it:

  • use verbose=True to see when it failed to converge
  • increase the maximum number of iterations to at least 1000
  • increase the damping factor to 0.9 instead of the default 0.5

The reason is that with default parameters, sklearn's AP usually does not converge...

As @AgostBiro mentioned before, preference is a parameter of the constructor, not of the fit function, so your original code ignored the preference: fit(X, y) ignores y. (It's a stupid API to have the dead y parameter, but sklearn likes that this looks like the classification API.)
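The suggestions above can be sketched roughly as follows. This is only a minimal illustration, not the asker's exact setup: the sample data, preference=-50 and random_state=0 are assumptions picked for the example, and passing random_state to AffinityPropagation assumes a reasonably recent scikit-learn. Comparing n_iter_ with max_iter is a rough way to tell whether the run stopped early (converged) or used up all iterations.

```python
import numpy as np
from sklearn.cluster import AffinityPropagation
from sklearn.datasets import make_blobs

# Illustrative data only (assumed, similar in spirit to the question's blobs).
X, _ = make_blobs(n_samples=150, centers=[[1, 1], [-1, -1], [1, -1]],
                  cluster_std=0.5, random_state=0)

max_iter = 1000  # well above the default of 200

af = AffinityPropagation(
    damping=0.9,        # stronger damping than the default 0.5
    max_iter=max_iter,
    verbose=True,       # prints a message about convergence
    preference=-50,     # assumed value, chosen only for this sketch
    random_state=0,     # assumed, for reproducibility
)
af.fit(X)

# If the run used every allowed iteration, it most likely did not converge.
stopped_early = af.n_iter_ < max_iter
print("iterations used:", af.n_iter_, "| stopped early:", stopped_early)
print("number of exemplars:", len(af.cluster_centers_indices_))
```

If stopped_early is False, it is probably worth raising max_iter or adjusting damping further before trusting the labels.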

Upvotes: 3

Agost Biro

Reputation: 2839

The preference is a parameter of the AffinityPropagation constructor, not of the fit() method. You should change line 19 (the AffinityPropagation().fit(S, preference) call) to:

af = AffinityPropagation(preference=preference).fit(S)
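To check that the preference now has an effect, a loop like the one below can help. It is a hedged sketch rather than a drop-in replacement for the question's code: the helper count_clusters, the preference factors, damping=0.9, max_iter=1000 and random_state=0 are all assumptions made for illustration. It also passes affinity="precomputed", which is my addition: S is already a similarity matrix, and with the default affinity="euclidean" sklearn would treat its rows as feature vectors.

```python
import numpy as np
from sklearn.cluster import AffinityPropagation
from sklearn.datasets import make_blobs

centers = [[1, 1], [-1, -1], [1, -1]]
X, labels_true = make_blobs(n_samples=300, centers=centers,
                            cluster_std=0.5, random_state=0)

# Negative squared Euclidean distances, computed as in the question.
X_norms = np.sum(X ** 2, axis=1)
S = -X_norms[:, np.newaxis] - X_norms[np.newaxis, :] + 2 * np.dot(X, X.T)


def count_clusters(similarities, preference):
    """Hypothetical helper: fit AP on a precomputed similarity matrix
    and return how many exemplars it found."""
    af = AffinityPropagation(
        preference=preference,    # goes to the constructor, not to fit()
        affinity="precomputed",   # assumed: S is used as similarities
        damping=0.9,
        max_iter=1000,
        random_state=0,
    )
    af.fit(similarities)
    return len(af.cluster_centers_indices_)


results = {}
for factor in (1, 10, 50):  # assumed multiples of the median similarity
    preference = factor * np.median(S)
    results[factor] = count_clusters(S, preference)
    print("preference=%.1f -> %d clusters" % (preference, results[factor]))
```

Preferences further below the median similarity generally lead to fewer exemplars, so the counts printed here should differ between at least some of the factors if the preference is being picked up.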

Upvotes: 1
