Reputation: 39
I'm using LIBSVM and MATLAB to classify a 34x5 data set into 3 classes. I applied 10-fold cross-validation with an RBF kernel. The result is a correct rate of 0.88 (88% accuracy). This is my confusion matrix:
9 0 0
0 3 0
0 4 18
I would like to know which SVM settings I should consider to improve the accuracy, or which other machine learning classification methods might work better. Any help?
Here is my SVM classification code:
load Turn180SVM1; % load data file; the last column holds the class labels
% nu-SVC (-s 1) with an RBF kernel (-t 2). For this combination LIBSVM ignores
% -d, -r, -c and -p; "-wi" is the README placeholder, where i should be a class label.
libsvm_options = '-s 1 -t 2 -d 3 -r 0 -c 1 -n 0.1 -p 0.1 -m 100 -e 0.000001 -h 1 -b 0 -wi 1 -q'; % svm options
C = size(Turn180SVM1,2);
% 10 repetitions of 10-fold cross validation
for i = 1:10
    indices = crossvalind('Kfold',Turn180SVM1(:,C),10);
    cp = classperf(Turn180SVM1(:,C)); % reset on every repetition
    for j = 1:10
        X = find(indices==j);  % testing rows
        Y = find(indices~=j);  % training rows
        feature_training = Turn180SVM1(Y,1:C-1); feature_testing = Turn180SVM1(X,1:C-1);
        class_training = Turn180SVM1(Y,end); class_testing = Turn180SVM1(X,end);
        % SVM Training
        disp('training');
        % scale each feature to [0,1] using the training fold only
        [feature_training,ps] = mapminmax(feature_training',0,1);
        feature_training = feature_training';
        feature_testing = mapminmax('apply',feature_testing',ps)';
        model = svmtrain(class_training,feature_training,libsvm_options);
        % SVM Prediction
        disp('testing');
        TestPredict = svmpredict(class_testing,sparse(feature_testing),model);
        TestErrap = sum(TestPredict~=class_testing)./length(class_testing)*100;
        cp = classperf(cp, TestPredict, X);
        disp(((i-1)*10)+j);
    end
end
% note: TestPredict and class_testing hold only the last fold at this point
[ConMat,order] = confusionmat(class_testing,TestPredict); % confusionmat(known, predicted)
cp.CorrectRate
cp.CountingMatrix
Upvotes: 2
Views: 6176
Reputation: 17026
Many methods exist. If your tuning procedure is optimal (e.g. well executed cross-validation) your choices include:
Improve preprocessing, perhaps tailor new aggregated features based on domain knowledge. Most importantly (and most effectively): make sure your inputs are standardized properly, for example by scaling every dimension onto [-1,1].
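As a minimal sketch of the scaling step, the snippet below maps every feature onto [-1,1] using only the training rows and then applies the same mapping to the test rows. The toy matrices `Xtrain` and `Xtest` and the helper `scale` are hypothetical names for illustration; the question's code does the equivalent with `mapminmax` on [0,1].

```matlab
% Hypothetical toy data: rows are samples, columns are features
Xtrain = [1 10; 2 20; 3 40];
Xtest  = [2 30; 4 10];

% Learn the per-feature range from the training rows only
lo = min(Xtrain, [], 1);
hi = max(Xtrain, [], 1);
span = hi - lo;
span(span == 0) = 1;   % guard against constant features

% Map [lo, hi] onto [-1, 1]; the same mapping is reused for the test rows
scale = @(X) 2 * bsxfun(@rdivide, bsxfun(@minus, X, lo), span) - 1;
XtrainScaled = scale(Xtrain);
XtestScaled  = scale(Xtest);
```

Test values can land outside [-1,1] when they exceed the training range, which is expected; the point is that the test fold never influences the scaling parameters.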
Use another kernel: RBF kernels are known to perform very well in a wide variety of settings, but specialised kernels exist for many tasks. Don't consider this unless you know what you are doing. Since you are dealing with a low-dimensional problem, RBF is probably a good choice if your data is not structured.
Reweight training instances: particularly important when your data set is unbalanced (e.g. some classes have far fewer instances than others). You can do this with the -wX options in libsvm (X is a class label; the weight multiplies C for that class, so it applies to C-SVC). All sorts of reweighting schemes exist, including variants of boosting. I'm not a major fan of this, since such approaches are prone to overfitting.
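One way to pick the weights is the inverse-frequency heuristic sketched below. This is only one common choice, not something prescribed above. The label vector is hypothetical: its class sizes (9, 3, 22) assume the rows of the question's confusion matrix are the true classes, and the `-s 0 -t 2 -c 1` part of the option string is likewise an assumption (C-SVC with an RBF kernel).

```matlab
% Hypothetical label vector; class sizes 9, 3 and 22 are assumed from the question
labels  = [ones(9,1); 2*ones(3,1); 3*ones(22,1)];
classes = unique(labels)';
counts  = arrayfun(@(c) sum(labels == c), classes);

% Inverse-frequency heuristic: rarer classes get a larger weight
weights = numel(labels) ./ (numel(classes) * counts);

% Build one "-w<label> <weight>" pair per class
weight_opts = strtrim(sprintf('-w%d %g ', [classes; weights]));

% Assumed base options: C-SVC (-s 0), RBF kernel (-t 2), C = 1
libsvm_options = ['-s 0 -t 2 -c 1 ' weight_opts ' -q'];
disp(libsvm_options);
```

The resulting string can be passed to `svmtrain` in place of the original options. Any weighting like this should itself be checked by cross-validation, given the overfitting risk mentioned above.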
Change the cross-validation cost function to suit your exact needs. Is accuracy really what you are looking for or do you want, say, high F1 or high ROC-AUC? It is surprising how many people optimize a performance measure they are not really interested in.
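To see how the choice of measure changes the picture, here is a small sketch that computes accuracy and per-class F1 from the confusion matrix in the question. It assumes rows are true classes and columns are predictions; F1 is the same under the opposite convention, because precision and recall simply swap.

```matlab
% Confusion matrix from the question; assumed rows = true class, columns = predicted class
CM = [9 0 0; 0 3 0; 0 4 18];

TP = diag(CM)';                    % correctly classified instances per class
precision = TP ./ sum(CM, 1);      % column sums = how often each class was predicted
recall    = TP ./ sum(CM, 2)';     % row sums = how many instances each class has
F1 = 2 * precision .* recall ./ (precision + recall);

accuracy = sum(TP) / sum(CM(:));
macroF1  = mean(F1);               % unweighted mean over classes
fprintf('accuracy = %.3f, macro F1 = %.3f\n', accuracy, macroF1);
```

The per-class F1 values come out as 1.0, 0.6 and 0.9, so the 88% accuracy hides a much weaker result on the second class. Macro F1 (about 0.833) reflects that, because every class counts equally regardless of its size.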
Upvotes: 3