edgarmtze

Reputation: 25058

Accord.NET multiclass SVM classification Kernel how to solve Out of memory exception

I want to use the nursery data set (8 attributes and 5 classes) to train an SVM, using the same logic as the C45Learning class example.

In the example, data is loaded from the nursery data set, which contains 8 attributes ("parents", "has_nurs", "form", "children", "housing", "finance", "social", "health"). Each combination of these attributes maps to one of 5 classes: "not_recom", "recommend", "very_recom", "priority", "spec_prior".

However, I do not know which kernel would best fit this kind of data. By definition, a polynomial kernel is a kernel function that represents the similarity of vectors (training samples) in a feature space over polynomials of the original variables, which allows learning of non-linear models. I tried this kernel but ran into problems when training the machine.
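For reference, a polynomial kernel is commonly written as k(x, y) = (x · y + c)^d. The sketch below computes it in plain C# without Accord.NET. It assumes that the two arguments in `new Polynomial(2, 5)` are the degree d = 2 and the constant c = 5; the class and method names are hypothetical and are not part of the framework.

```csharp
using System;

// Hypothetical helper, not an Accord.NET type.
static class PolynomialKernelSketch
{
    // Computes k(x, y) = (x . y + constant)^degree.
    public static double Kernel(double[] x, double[] y, int degree, double constant)
    {
        if (x.Length != y.Length)
            throw new ArgumentException("Vectors must have the same length.");

        // Dot product of the two input vectors.
        double dot = 0.0;
        for (int i = 0; i < x.Length; i++)
            dot += x[i] * y[i];

        return Math.Pow(dot + constant, degree);
    }
}
```

For example, `PolynomialKernelSketch.Kernel(new[] { 1.0, 2.0 }, new[] { 3.0, 4.0 }, 2, 5.0)` evaluates (1·3 + 2·4 + 5)^2 = 16^2 = 256.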

So far I have reused the code from the example to load the data, followed by SVM code like this:

// Same code as the C45 example to get input and output data

string nurseryData = Resources.nursery;
string[] inputColumns =
{
    "parents", "has_nurs", "form", "children",
    "housing", "finance", "social", "health"
};
string outputColumn = "output";
DataTable table = new DataTable("Nursery");
table.Columns.Add(inputColumns);
table.Columns.Add(outputColumn);
string[] lines = nurseryData.Split(
    new[] { Environment.NewLine }, StringSplitOptions.None);
foreach (var line in lines)
    table.Rows.Add(line.Split(','));
Codification codebook = new Codification(table);
DataTable symbols = codebook.Apply(table);
double[][] inputs = symbols.ToArray(inputColumns);
// Class labels must be integers, so request an int column explicitly
int[] outputs = symbols.ToArray<int>(outputColumn);
int inputDimension = 8;
int outputClasses = 5;

// SVM

IKernel kernel = new Polynomial(2, 5);
// Create the Multi-class Support Vector Machine using the selected Kernel
var ksvm = new MulticlassSupportVectorMachine(inputDimension, kernel, outputClasses);
// Create the learning algorithm using the machine and the training data
var ml = new MulticlassSupportVectorLearning(ksvm, inputs, outputs);
ml.Algorithm = (svm, classInputs, classOutputs, i, j) =>
        new SequentialMinimalOptimization(svm, classInputs, classOutputs);
double SVMerror = ml.Run();

However, I get an out-of-memory exception while training the machine. What am I missing?

[screenshot of the error; image not available]

EDIT

I now have another issue. Trying Cesar's code, I got this:

[screenshot of the second error; image not available]

Upvotes: 1

Views: 3492

Answers (1)

Cesar

Reputation: 2118

The framework automatically builds a kernel function cache to speed up computations during SVM learning. In some cases, however, this cache takes too much memory and leads to an OutOfMemoryException.

To balance memory consumption against CPU speed, set the CacheSize property to a lower value. The default is to store all input vectors in the cache; setting it to something lower (such as 1/20 of the number of training samples) should suffice.
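To get a feeling for the numbers, here is a rough back-of-the-envelope sketch in plain C#. It assumes that each cached entry is one row of the kernel matrix holding one 8-byte double per training sample; that layout is an assumption made for illustration, not a description of the framework's internals, and the helper name is hypothetical.

```csharp
using System;

// Hypothetical helper, not an Accord.NET type.
static class KernelCacheEstimate
{
    // Bytes needed to keep `cachedRows` kernel-matrix rows in memory,
    // assuming each row stores one double per training sample.
    public static long Bytes(long samples, long cachedRows)
    {
        if (samples < 0 || cachedRows < 0)
            throw new ArgumentOutOfRangeException();

        // Caching more rows than there are samples gains nothing.
        long rows = Math.Min(cachedRows, samples);
        return rows * samples * sizeof(double);
    }
}
```

Under that assumption, a hypothetical problem with 10,000 samples needs `KernelCacheEstimate.Bytes(10000, 10000)` = 800,000,000 bytes for a full cache, while a cache of 1/20 of the samples, `KernelCacheEstimate.Bytes(10000, 500)`, needs 40,000,000 bytes. The cost grows quadratically with the sample count when everything is cached, which is why lowering the cache size helps so much.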

If you set CacheSize to zero, then you will disable the cache entirely. Training might be a bit slower, but you won't have any memory problems. Please take a look at the code below. The resulting error I got is around 0.09.

// same code to get input and output data
string nurseryData = Properties.Resources.nursery;

string[] inputColumns =
{
    "parents", "has_nurs", "form", "children",
    "housing", "finance", "social", "health"
};

string outputColumn = "output";

DataTable table = new DataTable("Nursery");
table.Columns.Add(inputColumns);
table.Columns.Add(outputColumn);

string[] lines = nurseryData.Split(
    new[] { Environment.NewLine }, StringSplitOptions.None);

foreach (var line in lines)
    table.Rows.Add(line.Split(','));


Codification codebook = new Codification(table);

DataTable symbols = codebook.Apply(table);

double[][] inputs = symbols.ToArray(inputColumns);
int[] outputs = Matrix.ToArray<int>(symbols, outputColumn);

//SVM
IKernel kernel = new Linear();

// Create the Multi-class Support Vector Machine using the selected Kernel
int inputDimension = inputs[0].Length;
int outputClasses = codebook[outputColumn].Symbols;
var ksvm = new MulticlassSupportVectorMachine(inputDimension, kernel, outputClasses);

// Create the learning algorithm using the machine and the training data
var ml = new MulticlassSupportVectorLearning(ksvm, inputs, outputs)
{
    Algorithm = (svm, classInputs, classOutputs, i, j) =>
    {
        return new SequentialMinimalOptimization(svm, classInputs, classOutputs)
        {
            CacheSize = 0
        };
    }
};

double SVMerror = ml.Run(); // should be around 0.09

However, I agree that this might not be too obvious. I will add a better way to handle this case in a fix release. Thanks for posting your question!

Upvotes: 4
