Reputation: 25058
I want to train an SVM on the nursery data (8 attributes and 5 classes), using the same logic as in the C45Learning class example.
In the example, data is loaded from the nursery data set, which contains 8 attributes: "parents", "has_nurs", "form", "children", "housing", "finance", "social", "health",
and combinations of these attributes result in one of 5 classes: "not_recom", "recommend", "very_recom", "priority", "spec_prior".
However, I do not know which kernel would best fit this kind of data. By definition, the polynomial kernel is a kernel function that represents the similarity of vectors (training samples) in a feature space over polynomials of the original variables, allowing learning of non-linear models. I tried using this kernel but ran into issues when training the machine with the data.
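For reference, the polynomial kernel of degree d with constant c is commonly written in the textbook form below. It is an assumption here that the two arguments of `new Polynomial(2, 5)` in the code that follows correspond to d = 2 and c = 5; the vectors in the worked line are made up purely for illustration.

```latex
k(\mathbf{x}, \mathbf{y}) = \left( \mathbf{x}^{\top} \mathbf{y} + c \right)^{d}

% Illustrative values: d = 2, c = 5, x = (1, 2), y = (3, 4)
\mathbf{x}^{\top} \mathbf{y} = 1 \cdot 3 + 2 \cdot 4 = 11,
\qquad
k(\mathbf{x}, \mathbf{y}) = (11 + 5)^{2} = 256
```

With d = 1 and c = 0 this reduces to the plain dot product, which is the linear kernel.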
So far I have used the code from the example to load the data, and trained the SVM with code like this:
// Same code as the C45 example to get input and output data
string nurseryData = Resources.nursery;
string[] inputColumns =
{
    "parents", "has_nurs", "form", "children",
    "housing", "finance", "social", "health"
};
string outputColumn = "output";
DataTable table = new DataTable("Nursery");
table.Columns.Add(inputColumns);
table.Columns.Add(outputColumn);
string[] lines = nurseryData.Split(
    new[] { Environment.NewLine }, StringSplitOptions.None);
foreach (var line in lines)
    table.Rows.Add(line.Split(','));
Codification codebook = new Codification(table);
DataTable symbols = codebook.Apply(table);
double[][] inputs = symbols.ToArray(inputColumns);
int[] outputs = symbols.ToArray<int>(outputColumn);
int inputDimension = 8;
int outputClasses = 5;
// SVM
IKernel kernel = new Polynomial(2, 5);
// Create the Multi-class Support Vector Machine using the selected Kernel
var ksvm = new MulticlassSupportVectorMachine(inputDimension, kernel, outputClasses);
// Create the learning algorithm using the machine and the training data
var ml = new MulticlassSupportVectorLearning(ksvm, inputs, outputs);
ml.Algorithm = (svm, classInputs, classOutputs, i, j) =>
    new SequentialMinimalOptimization(svm, classInputs, classOutputs);
double SVMerror = ml.Run();
However, I get an error while training the machine. What am I missing?
I now have another issue. Trying Cesar's code, I got this:
Upvotes: 1
Views: 3492
Reputation: 2118
The framework automatically builds a kernel function cache to help speed up computations during SVM learning. However, there are cases where this cache takes too much memory and leads to an OutOfMemoryException.
To balance memory consumption against CPU speed, set the CacheSize property to a lower value. By default, all input vectors are stored in the cache; setting it to something lower (such as 1/20 of the number of training samples) should suffice.
If you set CacheSize to zero, the cache is disabled entirely. Training might be a bit slower, but you won't have any memory problems. Please take a look at the code below. The resulting error I got is around 0.09.
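As a rough back-of-the-envelope illustration of why the cache can exhaust memory, assume (this is an assumption for the sake of the estimate, not a description of the framework's exact internals) that a full cache holds on the order of one double-precision kernel value per pair of training samples, and that a smaller cache size counts rows of N values each. For a hypothetical N = 10,000 samples:

```latex
\text{full cache} \approx N^{2} \times 8\ \text{bytes}
  = (10^{4})^{2} \times 8\ \text{bytes}
  = 8 \times 10^{8}\ \text{bytes} \approx 800\ \text{MB}

\text{cache of } N/20 \text{ rows} \approx \frac{N}{20} \times N \times 8\ \text{bytes}
  = 500 \times 10^{4} \times 8\ \text{bytes}
  = 4 \times 10^{7}\ \text{bytes} \approx 40\ \text{MB}
```

Under these assumptions the memory grows quadratically with the number of samples, which is why lowering the cache size helps so much on larger data sets.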
// Namespaces needed by this snippet
using System;
using System.Data;
using Accord.MachineLearning.VectorMachines;
using Accord.MachineLearning.VectorMachines.Learning;
using Accord.Math;
using Accord.Statistics.Filters;
using Accord.Statistics.Kernels;

// Same code to get input and output data
string nurseryData = Properties.Resources.nursery;
string[] inputColumns =
{
    "parents", "has_nurs", "form", "children",
    "housing", "finance", "social", "health"
};
string outputColumn = "output";
DataTable table = new DataTable("Nursery");
table.Columns.Add(inputColumns);
table.Columns.Add(outputColumn);
string[] lines = nurseryData.Split(
    new[] { Environment.NewLine }, StringSplitOptions.None);
foreach (var line in lines)
    table.Rows.Add(line.Split(','));
// Convert the symbolic values into integer codes
Codification codebook = new Codification(table);
DataTable symbols = codebook.Apply(table);
double[][] inputs = symbols.ToArray(inputColumns);
int[] outputs = Matrix.ToArray<int>(symbols, outputColumn);
// SVM
IKernel kernel = new Linear();
// Create the Multi-class Support Vector Machine using the selected Kernel
int inputDimension = inputs[0].Length;
int outputClasses = codebook[outputColumn].Symbols;
var ksvm = new MulticlassSupportVectorMachine(inputDimension, kernel, outputClasses);
// Create the learning algorithm using the machine and the training data
var ml = new MulticlassSupportVectorLearning(ksvm, inputs, outputs)
{
    Algorithm = (svm, classInputs, classOutputs, i, j) =>
    {
        return new SequentialMinimalOptimization(svm, classInputs, classOutputs)
        {
            CacheSize = 0 // disable the kernel cache to avoid running out of memory
        };
    }
};
double SVMerror = ml.Run(); // should be around 0.09
However, I agree that this might not be too obvious. I will add a better way to handle this case in a fix release. Thanks for posting your question!
Upvotes: 4