Reputation: 51
I'm working on a problem on Matlab according to Matrix. I think my code could be improved by remove the for loop. But I really don't know how to fix this one. Can anyone help me, please? the code is:
K = 3;
X = [1 2; 3 4; 5 6; 7 8];
idx = [1;2;3;1];
for i = 1:K
ids = (idx == i);
centroids(i,:) = sum(bsxfun(@times, X, ids))./ sum(ids);
end
in this code, data points X is 4x2. There are K=3 centroids, so the centroids is a matrix of 3x2. This code is part of a K-mean function, which is using data points and their closest centroids to find new position of centroids. I want to make the code as something without the FOR loop, maybe beginning like this:
ids = bsxfun(@eq, idx, 1:K);
centroids = ..............
Upvotes: 3
Views: 221
Reputation: 8459
You can avoid the bsxfun
by using logical indexing, this seems to be a worthwhile performance increase, at least for small matrices X
. It is best for small K
, and for a small number of rows of X
.
K = 3;
X = [1 2; 3 4; 5 6; 7 8];
idx = [1;2;3;1];
centroids=zeros(K,2);
for i = 1:K
ids = (idx == i);
centroids(i,:) = sum(X(ids,:),1)./sum(ids);
end
If X
has a large number of rows, this method is fastest:
K = 3;
X = [1 2; 3 4; 5 6; 7 8];
idx = [1;2;3;1];
centroids=zeros(K,2);
t=bsxfun(@eq,idx,1:K);
centroids=bsxfun(@rdivide,t.'*X,sum(t).');
And if K
is very large, Luis' accumarray
method is fastest.
Upvotes: 4
Reputation: 112679
You could apply accumarray
. Note that accumarray
only works when X
is a column. So, if X
has two columns, you can call accumarray
twice:
centroids(:,1) = accumarray(idx, X(:,1), [], @mean)
centroids(:,2) = accumarray(idx, X(:,2), [], @mean)
Alternatively, if X
contains two columns of real numbers, you can use complex
to "pack" the two columns into one complex column, and then unpack the results:
centroids = accumarray(idx, complex(X(:,1),X(:,2)), [], @mean);
centroids = [ real(centroids) imag(centroids)];
If X
has an arbitrary number of columns, possibly with complex numbers, you can loop over columns:
centroids = NaN(K, size(X,2)); %// preallocate
for col = 1:size(X,2);
centroids(:,col) = accumarray(idx, X(:,col), [], @mean);
end
Upvotes: 4