michasaucer

Reputation: 5238

Training set has 0 instances, abort training exception

I'm rebuilding my project on ML.NET 0.10. I got the data from this link and saved it as a .csv file that looks like this:

diagnosis;radius_mean;texture_mean;perimeter_mean;area_mean;smoothness_mean;compactness_mean;concavity_mean;concave points_mean;symmetry_mean;fractal_dimension_mean;radius_se;texture_se;perimeter_se;area_se;smoothness_se;compactness_se;concavity_se;concave points_se;symmetry_se;fractal_dimension_se;radius_worst;texture_worst;perimeter_worst;area_worst;smoothness_worst;compactness_worst;concavity_worst;concave points_worst;symmetry_worst;fractal_dimension_worst
B;11.62;18.18;76.38;408.8;0.1175;0.1483;0.102;0.05564;0.1957;0.07255;0.4101;1.74;3.027;27.85;0.01459;0.03206;0.04961;0.01841;0.01807;0.005217;13.36;25.4;88.14;528.1;0.178;0.2878;0.3186;0.1416;0.266;0.0927
B;9.667;18.49;61.49;289.1;0.08946;0.06258;0.02948;0.01514;0.2238;0.06413;0.3776;1.35;2.569;22.73;0.007501;0.01989;0.02714;0.009883;0.0196;0.003913;11.14;25.62;70.88;385.2;0.1234;0.1542;0.1277;0.0656;0.3174;0.08524

My data class looks like this:

class CancerData
{
    [LoadColumn(0, 30), ColumnName("Features")]
    public float FeatureVector { get; set; }

    [LoadColumn(31)]
    public float Target { get; set; }
}

Now, my Program.cs file:

var mlContext = new MLContext();
var trainData = mlContext.Data.ReadFromTextFile<CancerData>("Cancer-train.csv", 
                             hasHeader: true, 
                             separatorChar: ';');

var pipeline = mlContext.Transforms
                        .Normalize("Features")
                        .AppendCacheCheckpoint(mlContext)
            .Append(mlContext.BinaryClassification.Trainers.StochasticDualCoordinateAscent(labelColumn: "Target", featureColumn: "Features"));

var model = pipeline.Fit(trainData);

var testData = mlContext.Data.ReadFromTextFile<CancerData>("Cancer-test.csv", 
                             hasHeader: true, 
                             separatorChar: ';');

var metrics = mlContext.BinaryClassification.Evaluate(model.Transform(testData), label: "Target");

When I run this code, I get an exception that says:

System.InvalidOperationException: 'Training set has 0 instances, aborting training.'


My question is: is my code correct? My .csv files are in the project folder, and the same data worked with ML.NET 0.5. Thanks for any advice!

Upvotes: 0

Views: 1315

Answers (2)

sogand

Reputation: 11

I had the same error, but my problem was in how I loaded the data: I had mistakenly set (separatorChar: '/t'). When I corrected that to (separatorChar: ','), which is what my file actually uses, the problem was solved.
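
A quick way to catch this kind of mismatch before training is to split the file's header line with each candidate separator and see which one yields the most fields. The following is only a minimal sketch using plain .NET string handling; `SeparatorCheck` and `GuessSeparator` are hypothetical names introduced here and are not part of ML.NET.

```csharp
using System;
using System.Linq;

static class SeparatorCheck
{
    // Hypothetical helper (not an ML.NET API): returns the candidate
    // separator that splits the header line into the most fields.
    // On a tie, the candidate listed first wins, because
    // OrderByDescending is a stable sort.
    // Throws InvalidOperationException if no candidates are given.
    public static char GuessSeparator(string headerLine, params char[] candidates)
    {
        return candidates
            .OrderByDescending(c => headerLine.Split(c).Length)
            .First();
    }
}
```

For example, passing the first line of the training file (read with `File.ReadLines(path).First()`, which needs `System.IO`) together with the candidates `';'`, `','` and `'\t'` should return the character to use for `separatorChar`, assuming the header contains no quoted fields with embedded separators.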

Upvotes: 1

Panagiotis Kanavos

Reputation: 131601

LoadColumn(0, 30) specifies that the data is loaded from columns 0 to 30, and yet FeatureVector is a single float. It should be a float[] at least.

The first column, though, contains text data, so it should be excluded from the FeatureVector array.

CancerData should probably look like this:

class CancerData
{
    [LoadColumn(1, 30), ColumnName("Features")]
    public float[] FeatureVector { get; set; }

    [LoadColumn(31)]
    public float Target { get; set; }
}

If the diagnosis column is needed, it should be:

class CancerData
{
    [LoadColumn(0)]
    public string Diagnosis { get; set; }

    [LoadColumn(1, 30), ColumnName("Features")]
    public float[] FeatureVector { get; set; }

    [LoadColumn(31)]
    public float Target { get; set; }
}
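
The indices passed to LoadColumn are zero-based, so it is worth checking them against the actual file. The header posted in the question has 31 fields (diagnosis plus 30 measurements), which gives valid indices 0 through 30. If the real files match that sample, index 31 would not exist, and the label would have to come from another column. The following is a minimal sketch of such a check using plain .NET string handling; `ColumnIndexCheck`, `CountFields` and `IndexExists` are hypothetical names, not ML.NET APIs.

```csharp
using System;

static class ColumnIndexCheck
{
    // Hypothetical helper: number of fields in one delimited line.
    // Assumes no quoted fields that contain the separator.
    public static int CountFields(string line, char separator)
        => line.Split(separator).Length;

    // Hypothetical helper: true when the zero-based column index
    // refers to a field that is actually present in the line.
    public static bool IndexExists(string line, char separator, int index)
        => index >= 0 && index < CountFields(line, separator);
}
```

Running `IndexExists` on the header line of Cancer-train.csv with `';'` and each index used in the LoadColumn attributes shows whether every attribute points at a real column.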

Upvotes: 2
