user2157806

Reputation: 319

Bad results when testing libsvm in matlab

Can someone help me solve this? I want to check whether this classification works. To do that, I use the training data as the test data, so a good classifier should give 100% accuracy. This is the code, which I found on this site:

data = [170 66;
        160 50;
        170 63;
        173 61;
        168 58;
        184 88;
        189 94;
        185 88]

labels=[-1;-1;-1;-1;-1;1;1;1];

numInst = size(data,1);
numLabels = max(labels);

testVal = [1 2 3 4 5 6 7 8];
trainLabel = labels(testVal,:);
trainData = data(testVal,:);
testData = data(testVal,:);
testLabel = labels(testVal,:);
numTrain = 8; numTest = 8

%# train one-against-all models
model = cell(numLabels,1);
for k=1:numLabels
    model{k} = svmtrain(double(trainLabel==k), trainData, '-c 1 -t 2 -g 0.2 -b 1');
end

%# get probability estimates of test instances using each model
prob = zeros(numTest,numLabels);
for k=1:numLabels
    [~,~,p] = svmpredict(double(testLabel==k), testData, model{k}, '-b 1');
    prob(:,k) = p(:,model{k}.Label==1);    %# probability of class==k
end


%# predict the class with the highest probability
[~,pred] = max(prob,[],2);
acc = sum(pred == testLabel) ./ numel(testLabel)    %# accuracy
C = confusionmat(testLabel, pred)                   %# confusion matrix

and this is the results:

optimization finished, #iter = 16  
nu = 0.645259 obj = -2.799682, 
rho = -0.437644 nSV = 8, nBSV = 1 Total nSV = 8 
Accuracy = 100% (8/8) (classification)

acc =

    0.3750


C =

     0     5
     0     3

I don't know why there are two accuracies, or why they differ: the first is 100% and the second is 0.375. Is my code wrong? It should be 100%, not 37.5%. Can you help me correct this code?

Upvotes: 0

Views: 1180

Answers (5)

Mo Farouk

Reputation: 41

I'm sorry to say that all the other answers are wrong. The main error in the code is:

numLabels = max(labels);

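because it returns 1, whereas it should return 2 (as it would if the labels were the positive integers 1 and 2) so that the svmtrain/svmpredict loops run twice. With a single column in prob, the index returned by max is always 1, so pred is all ones and matches only the three instances labelled 1.

So change the line labels=[-1;-1;-1;-1;-1;1;1;1]; to labels=[2;2;2;2;2;1;1;1]; and it will work.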
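
A toolbox-free sketch of why the original code reports 0.375 and how the relabelling changes that. The probability values below are made up for illustration and are not LIBSVM output; only the label vectors come from the question and from this answer.

```matlab
% Labels from the question: five -1 instances, three +1 instances.
testLabel = [-1;-1;-1;-1;-1;1;1;1];

% With numLabels = max(labels) = 1, prob has a single column, so the
% column index returned by max is always 1.
probOneCol = [0.1;0.2;0.1;0.2;0.1;0.9;0.8;0.9];    % hypothetical values
[~, pred] = max(probOneCol, [], 2);
accBuggy = sum(pred == testLabel) / numel(testLabel);      % 3/8 = 0.375

% Relabelled as suggested: -1 becomes 2, +1 stays 1. Column k of prob now
% corresponds to label k, so the index from max is itself a valid label.
newLabel = [2;2;2;2;2;1;1;1];
probTwoCol = [probOneCol, 1 - probOneCol];         % hypothetical values
[~, predFixed] = max(probTwoCol, [], 2);
accFixed = sum(predFixed == newLabel) / numel(newLabel);   % 8/8 = 1
```

The point of the sketch is that max returns a column position, which only doubles as a class label when the labels are 1..numLabels.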

Upvotes: 0

Pepe Mandioca

Reputation: 343

The problem is in

[~,~,p] = svmpredict(double(testLabel==k), testData, model{k}, '-b 1');

p does not contain the predicted labels; it holds the probability estimates for the classes. LIBSVM's svmpredict already calculates the accuracy for you, which is why the debug output says 100%. The fix is simple:

[p,~,~] = svmpredict(double(testLabel==k), testData, model{k}, '-b 1');

According to LIBSVM's Matlab bindings README:

The function 'svmpredict' has three outputs. The first one,
predictd_label, is a vector of predicted labels. The second output,
accuracy, is a vector including accuracy (for classification), mean
squared error, and squared correlation coefficient (for regression).
The third is a matrix containing decision values or probability
estimates (if '-b 1' is specified). If k is the number of classes
in training data, for decision values, each row includes results of 
predicting k(k-1)/2 binary-class SVMs. For classification, k = 1 is a
special case. Decision value +1 is returned for each testing instance,
instead of an empty vector. For probabilities, each row contains k values
indicating the probability that the testing instance is in each class.
Note that the order of classes here is the same as 'Label' field
in the model structure.
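
To illustrate the last sentence of the README excerpt, here is a sketch of selecting a probability column by class label rather than by position. The model struct and the matrix p are hypothetical stand-ins for what svmtrain and svmpredict would return; only the Label field name comes from the README and the question's code.

```matlab
% Hypothetical stand-in for a trained binary model. The Label field lists
% the class labels in the order LIBSVM uses for the probability columns.
model.Label = [0; 1];

% Hypothetical probability estimates: one row per test instance, one
% column per entry of model.Label, in the same order.
p = [0.9 0.1;
     0.2 0.8;
     0.3 0.7];

% Select the column that belongs to class 1, wherever it happens to sit.
probClass1 = p(:, model.Label == 1);    % second column: [0.1; 0.8; 0.7]
```

This is the same indexing the question's code uses in prob(:,k) = p(:,model{k}.Label==1), which relies on p being the third output of svmpredict.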

Upvotes: 0

user2934409

Reputation: 11

In this line:

C = confusionmat(testLabel, pred)

swap the positions of the two arguments:

C = confusionmat(pred, testLabel)

Or use this:

[ConMat,order] = confusionmat(pred,testLabel)

which returns both the confusion matrix and the class order.
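
As a toolbox-free sketch of what the argument order changes, the snippet below tabulates the counts by hand with accumarray instead of calling confusionmat. It uses the labels from the question and assumes the all-ones pred that the question's code produces; under that assumption, swapping the arguments transposes the matrix and leaves the accuracy as it was.

```matlab
testLabel = [-1;-1;-1;-1;-1;1;1;1];   % true labels from the question
pred      = ones(8,1);                % index returned by max for a one-column prob

order = unique([testLabel; pred]);    % class order: [-1; 1]
[~, tIdx] = ismember(testLabel, order);
[~, pIdx] = ismember(pred, order);
n = numel(order);

% Rows follow the first argument, columns follow the second argument.
C        = accumarray([tIdx pIdx], 1, [n n]);   % [0 5; 0 3], as in the question
Cswapped = accumarray([pIdx tIdx], 1, [n n]);   % the transpose of C
```

Printing order alongside the matrix makes it easier to see which row and column belong to which class.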

Upvotes: 1

abner.ayala

Reputation: 73

If you're using LIBSVM, you should rename its MEX file, since MATLAB already has an SVM toolbox function named svmtrain. However, the code runs, so it seems you did rename it, just not in the code you posted.

The second accuracy is wrong, though I don't know exactly why. However, you will almost always get 100% accuracy if you test on the training data. That result doesn't mean much, since the model can be overfit without that showing up in your results. Test against new data to get a realistic accuracy.
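
A minimal sketch of testing on held-out data, using the data from the question. The split size of two instances is an arbitrary choice for illustration, and the LIBSVM calls are guarded so the split itself runs even when the MEX files are not on the path. The svmtrain call assumes the LIBSVM binding (labels first, then data), not MATLAB's own function of the same name.

```matlab
data   = [170 66; 160 50; 170 63; 173 61; 168 58; 184 88; 189 94; 185 88];
labels = [-1;-1;-1;-1;-1;1;1;1];

numInst = size(data, 1);
numTest = 2;                       % hold out two instances (arbitrary choice)
idx      = randperm(numInst);      % random shuffle of the row indices
testIdx  = idx(1:numTest);
trainIdx = idx(numTest+1:end);

% Only call LIBSVM if its svmpredict MEX file is actually on the path.
if exist('svmpredict', 'file') == 3
    model = svmtrain(labels(trainIdx), data(trainIdx,:), '-c 1 -t 2 -g 0.2 -q');
    [predLabel, accuracy, ~] = svmpredict(labels(testIdx), data(testIdx,:), model);
end
```

With only eight instances a single hold-out split is very noisy; LIBSVM's -v n training option, which runs n-fold cross-validation, may be a better fit for data this small.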

Upvotes: 2

alrikai

Reputation: 4194

Is that the code you're using? I don't think your svmtrain invocation is valid. You should have svmtrain(MAT, VECT, ...), where MAT is a matrix of data and VECT is a vector with the label of each row of MAT. The remaining parameters are string-value pairs, meaning you'll have a string identifier and its corresponding value.

When I ran your code (Linux, R2011a) I got an error on the svmtrain call. Running with svmtrain(trainData, double(trainLabel==k)) gave a valid output (for that line). Of course, it appears that you're not using pure MATLAB, as your svmpredict call isn't native MATLAB but rather a MATLAB binding from LIBSVM...
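
A small sketch for checking which svmtrain is being called, since the two argument orders discussed here belong to different functions. The LIBSVM form in the comment matches the call in the question; the toolbox form matches the call that worked in this answer.

```matlab
% List every svmtrain visible on the path; the first entry is the one
% that will actually be called.
found = which('svmtrain', '-all');

% LIBSVM binding:           svmtrain(labelVector, dataMatrix, 'options')
% MATLAB toolbox function:  svmtrain(dataMatrix, groupVector, ...)
if isempty(found)
    disp('No svmtrain found on the path.');
else
    disp(found);
end
```

If the toolbox version is listed first, the LIBSVM-style call with labels as the first argument would be expected to fail, which would be consistent with the error described above.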

Upvotes: 1
