K-means in Matlab

Question

I have a knowledge base (KB) represented by a 100x15 matrix A, and I need to cluster this KB into 5 clusters.

I used the following MATLAB code:

idx=kmeans(A,5)

The result idx contains the cluster index for each row of A.

Now I have a new 1x15 vector B, a new entry, and I need to assign it to a cluster based on the clustering already obtained.

When I append the new entry B to the KB and call the function again on C (A and B combined):

idx1=kmeans(C,5)

I obtain a new idx1 in which all the results differ from idx.

My goal is to determine which cluster B belongs to with respect to the clusters obtained from the KB.

Could you help me?

Thanks in advance.

Geoff · Accepted Answer

It sounds like you want to assign the new data point to one of the already-identified clusters. I'm not sure this will always give the results you expect, but you could compute the Euclidean distance from the new point to each cluster centroid and pick the cluster with the smallest distance.

Example

Original data, constructed so as to have four clusters:

% original data: four groups of 25 points each
A=[randn(25,1),   randn(25,1);
   randn(25,1)+5, randn(25,1);
   randn(25,1)+5, randn(25,1)+5;
   randn(25,1),   randn(25,1)+5];
plot(A(:,1),A(:,2),'k.');
hold on;

K-means clustering with K=4 clusters:

K=4;
[idx,centroids]=kmeans(A,K);
for n=1:K
    plot(A(idx==n,1),A(idx==n,2),'o');
end

Note that the second output of kmeans contains the centroid coordinates, one row per cluster.
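
As a quick check on that second output, the sketch below uses made-up data and a fixed seed (both are my assumptions, not part of the example above). It shows that centroids has one row per cluster and one column per variable, and that idx holds one cluster index per data row.

```matlab
% Sketch with hypothetical data; kmeans needs the Statistics and Machine Learning Toolbox
rng(1);                                % fixed seed, only so the run is repeatable
X = [randn(25,2); randn(25,2)+5];      % two well-separated groups (assumed data)
K = 2;
[idx, centroids] = kmeans(X, K);

sizeCentroids = size(centroids);       % K-by-2: one row per cluster, one column per variable
sizeIdx = size(idx);                   % 50-by-1: one cluster index per row of X

% Row k of centroids belongs to the points with idx == k. With the default
% squared Euclidean distance, the mean of those points should agree with
% centroids(1,:) up to rounding.
clusterMean = mean(X(idx == 1, :), 1);
```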

Random new point:

% new point
B=2*randn(1,2);
plot(B(1),B(2),'rx');

Distance between new point and all centroids:

dist2cent = sqrt(sum((repmat(B,[K,1])-centroids).^2,2));
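
To make the distance computation concrete, here is a small worked case with hand-picked centroids (hypothetical values, not the output of kmeans above). On MATLAB R2016b or later, implicit expansion lets you subtract the row vector directly, so repmat is optional.

```matlab
% Hypothetical centroids at the corners of a square, and a point near the first one
centroids = [0 0; 5 0; 5 5; 0 5];
B = [1 1];
K = size(centroids, 1);

% Same formula as above, written with repmat (works in any MATLAB version)
d1 = sqrt(sum((repmat(B, [K,1]) - centroids).^2, 2));

% Equivalent form using implicit expansion (R2016b and later)
d2 = sqrt(sum((B - centroids).^2, 2));

% Both give [sqrt(2); sqrt(17); sqrt(32); sqrt(17)], so centroid 1 is the nearest
```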

Index of smallest distance:

[~,closest] = min(dist2cent);

plot([centroids(closest,1), B(1)],...
     [centroids(closest,2), B(2)],...
     'r-');
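
Applied to the problem in the question, the same idea might look like the sketch below. The matrices are random stand-ins for the real 100x15 knowledge base and the 1x15 new entry, and the fixed seed is only there to make the run repeatable; all of these are assumptions. The key point is that kmeans runs once on A, and B is then assigned to the nearest existing centroid instead of re-clustering. Re-running kmeans on [A; B] starts from a new random initialization and numbers the clusters arbitrarily, which is why idx1 need not match idx.

```matlab
% Stand-in data (assumed); replace with the real knowledge base and new entry
rng(0);
A = randn(100, 15);
B = randn(1, 15);

% Cluster the knowledge base once and keep the centroids
K = 5;
[idx, centroids] = kmeans(A, K);

% Euclidean distance from B to each of the K centroids (1-by-K row vector)
d = pdist2(B, centroids);

% The nearest centroid gives the cluster assigned to B
[~, clusterOfB] = min(d);
```

The labels in idx are left untouched, so the cluster found for B is directly comparable with the original clustering of A.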
