bespectacled

Reputation: 2901

MATLAB debugging - beginner level

I am a total beginner in MATLAB and am trying to write some machine learning algorithms in it. I would really appreciate it if someone could help me debug this code.

function y = KNNpredict(trX,trY,K,X)
   % trX is NxD, trY is Nx1, K is 1x1 and X is 1xD
   % we return a single value 'y' which is the predicted class

% TODO: write this function
% int[] distance = new int[N];
distances = zeroes(N, 1);
examples = zeroes(K, D+2);
i = 0;
% for(every row in trX) { // taking ONE example
for row=1:N, 
 examples(row,:) = trX(row,:);
 %sum = 0.0;
 %for(every col in this example) { // taking every feature of this example
 for col=1:D, 
    % diff = compute squared difference between these points - (trX[row][col]-X[col])^2
    diff =(trX(row,col)-X(col))^2;
    sum += diff;
 end % for
 distances(row) = sqrt(sum);
 examples(i:D+1) = distances(row);
 examples(i:D+2) = trY(row:1);
end % for

% sort the examples based on their distances thus calculated
sortrows(examples, D+1);
% for(int i = 0; i < K; K++) {
% These are the nearest neighbors
pos = 0;
neg = 0;
res = 0;
for row=1:K,
    if(examples(row,D+2 == -1))
        neg = neg + 1;
    else
        pos = pos + 1;
    %disp(distances(row));
    end
end % for

if(pos > neg)
    y = 1;
    return;
else
    y = -1;
    return;
end
end
end

Thanks so much

Upvotes: 0

Views: 340

Answers (1)

Amro

Reputation: 124543

When working with matrices in MATLAB, it is usually better to avoid excessive loops and use vectorized operations instead; this generally produces faster and shorter code.
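
For instance, the distance bookkeeping that takes two nested loops in your code can be written in one line. Here is a rough sketch of both styles, assuming a training matrix trX (NxD) and a single instance x (1xD) as in your function:

%# loop version: one distance per training instance
dist = zeros(size(trX,1), 1);
for i = 1:size(trX,1)
    dist(i) = sqrt( sum( (trX(i,:) - x).^2 ) );
end

%# vectorized version: all N distances at once
dist = sqrt( sum( bsxfun(@minus, trX, x).^2 , 2) );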

In your case, the k-nearest-neighbors algorithm is simple enough to vectorize fully. Consider the following implementation:

function y = KNNpredict(trX, trY, K, x)
    %# euclidean distance between instance x and every training instance
    dist = sqrt( sum( bsxfun(@minus, trX, x).^2 , 2) );

    %# sorting indices from smaller to larger distances
    [~,ord] = sort(dist, 'ascend');

    %# get the labels of the K nearest neighbors
    kTrY = trY( ord(1:min(K,end)) );

    %# majority class vote
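    %# (if several labels are tied, MODE returns the smallest of them)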
    y = mode(kTrY);
end
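
As a quick sanity check, you can call it on a tiny made-up dataset (the values below are purely illustrative): two well-separated clusters labelled -1 and +1, and a query point sitting inside the first cluster:

trX = [0 0; 0 1; 1 0; 5 5; 5 6; 6 5];
trY = [-1; -1; -1; 1; 1; 1];
y = KNNpredict(trX, trY, 3, [0.5 0.5])    %# all 3 nearest neighbors are -1, so y = -1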

Here is an example to test it using the Fisher-Iris dataset:

%# load dataset (data + labels)
load fisheriris
X = meas;
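%# grp2idx converts the cell array of species names into numeric labels 1..3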
Y = grp2idx(species);

%# partition the data into training/testing
c = cvpartition(Y, 'holdout',1/3);
trX = X(c.training,:);
trY = Y(c.training);
tsX = X(c.test,:);
tsY = Y(c.test);

%# prediction
K = 10;
pred = zeros(c.TestSize,1);
for i=1:c.TestSize
    pred(i) = KNNpredict(trX, trY, K, tsX(i,:));
end

%# validation
C = confusionmat(tsY, pred)

The confusion matrix of the kNN prediction with K=10:

C =
    17     0     0
     0    16     0
     0     1    16
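
If you want a single accuracy figure, the correct predictions sit on the diagonal of C, so:

acc = sum(diag(C)) / sum(C(:))    %# 49/50 = 0.98 for the matrix above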

Upvotes: 2
