Reputation: 123
I've found a great introduction to ML.NET: https://www.codeproject.com/Articles/1249611/Machine-Learning-with-ML-Net-and-Csharp-VB-Net. It helped me answer several questions about ML.NET.
But one question is still open:
When I send some text to the language detector (the LanguageDetection example), I always receive a result, even when the classification is not confident, as with a very short text fragment. Can I get information about the confidence of a multiclass classification? Or the probability of belonging to each class, so that I can use it in a second pass of the algorithm that takes the languages of neighboring sentences into account?
Upvotes: 4
Views: 4291
Reputation: 123
Following @Jon's hint, I modified the original example from CodeProject. The code is available at: https://github.com/sotnyk/LanguageDetector/tree/Code-for-stackoverflow-52536943
The main change (as suggested by Jon) is adding the field:
public float[] Score;
to the class ClassPrediction.
When this field exists, it receives the per-class probabilities/confidences of the multiclass classification.
But the original example poses another difficulty: it uses float values as category labels, and these values are not indices into the score array. To map score indices to categories, we should use the method TryGetScoreLabelNames:
if (!model.TryGetScoreLabelNames(out var scoreClassNames))
    throw new Exception("Can't get score classes");
But this method does not work when the class labels are float values. So I changed the original .tsv files and the fields ClassificationData.LanguageClass and ClassPrediction.Class to use string labels as class names.
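Once the label names are available, each score can be paired with its label by index. Here is a minimal, self-contained sketch of that pairing; it does not call ML.NET at all. The helper name Rank and the sample values are hypothetical, standing in for scoreClassNames and prediction.Score.

```csharp
using System;
using System.Linq;

public static class ScoreRanking
{
    // Pairs each score with the label name at the same index and
    // sorts the pairs by descending score.
    public static (string Label, float Score)[] Rank(string[] labelNames, float[] scores)
    {
        if (labelNames.Length != scores.Length)
            throw new ArgumentException("Label and score arrays must have the same length.");

        return labelNames
            .Zip(scores, (label, score) => (Label: label, Score: score))
            .OrderByDescending(p => p.Score)
            .ToArray();
    }

    public static void Main()
    {
        // Hypothetical values standing in for scoreClassNames and prediction.Score.
        string[] labelNames = { "en", "de", "fr" };
        float[] scores = { 0.15f, 0.70f, 0.15f };

        foreach (var (label, score) in Rank(labelNames, scores))
            Console.WriteLine($"{label}: {score:F2}");
    }
}
```

The first entry of the ranked array is the predicted class, and the gap between the first and second entries gives a rough idea of how clear-cut the decision was.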
An additional change, not directly related to the question:
The scores for every language are now printed in the application named Prediction. That part of the code looks as follows:
    internal static async Task<PredictionModel<ClassificationData, ClassPrediction>> PredictAsync(
        string modelPath,
        IEnumerable<ClassificationData> predicts = null,
        PredictionModel<ClassificationData, ClassPrediction> model = null)
    {
        if (model == null)
        {
            // The instance is unused; presumably it is created so that the
            // LightGBM assembly is loaded before the model is read.
            new LightGbmArguments();
            model = await PredictionModel.ReadAsync<ClassificationData, ClassPrediction>(modelPath);
        }

        if (predicts == null) // do we have input to predict a result?
            return model;

        // Use the model to predict the language class of each input.
        IEnumerable<ClassPrediction> predictions = model.Predict(predicts);

        Console.WriteLine();
        Console.WriteLine("Classification Predictions");
        Console.WriteLine("--------------------------");

        // Builds pairs of (input, prediction)
        IEnumerable<(ClassificationData sentiment, ClassPrediction prediction)> sentimentsAndPredictions =
            predicts.Zip(predictions, (sentiment, prediction) => (sentiment, prediction));

        // Label names in the same order as the entries of the Score array.
        if (!model.TryGetScoreLabelNames(out var scoreClassNames))
            throw new Exception("Can't get score classes");

        foreach (var (sentiment, prediction) in sentimentsAndPredictions)
        {
            string textDisplay = sentiment.Text;
            if (textDisplay.Length > 80)
                textDisplay = textDisplay.Substring(0, 75) + "...";

            string predictedClass = prediction.Class;
            Console.WriteLine("Prediction: {0}-{1} | Test: '{2}', Scores:",
                prediction.Class, predictedClass, textDisplay);

            // Print every class index with its label name and score.
            for (var l = 0; l < prediction.Score.Length; ++l)
            {
                Console.Write($" {l}({scoreClassNames[l]})={prediction.Score[l]}");
            }
            Console.WriteLine();
            Console.WriteLine();
        }
        Console.WriteLine();
        return model;
    }
}
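With the per-class scores available, the second pass mentioned in the question could look roughly like the sketch below. This is only one possible approach, not something taken from the linked repository. It assumes the scores behave like probabilities, and the names Resolve and minConfidence, the 0.6 threshold, and the sample score vectors are all hypothetical. A sentence whose best score falls below the threshold is re-decided from the average of its own scores and those of its immediate neighbors.

```csharp
using System;
using System.Linq;

public static class NeighborSmoothing
{
    // Index of the largest score.
    public static int ArgMax(float[] scores) =>
        Array.IndexOf(scores, scores.Max());

    // Returns one class index per sentence. A sentence whose best score is
    // below minConfidence is re-decided from the average of its own score
    // vector and those of its immediate neighbors.
    public static int[] Resolve(float[][] scoresPerSentence, float minConfidence)
    {
        var result = new int[scoresPerSentence.Length];
        for (int i = 0; i < scoresPerSentence.Length; i++)
        {
            float[] own = scoresPerSentence[i];
            if (own.Max() >= minConfidence)
            {
                result[i] = ArgMax(own);
                continue;
            }

            // Window of the previous, current, and next sentence, clipped at the ends.
            int from = Math.Max(0, i - 1);
            int to = Math.Min(scoresPerSentence.Length - 1, i + 1);
            var averaged = new float[own.Length];
            for (int n = from; n <= to; n++)
                for (int c = 0; c < own.Length; c++)
                    averaged[c] += scoresPerSentence[n][c] / (to - from + 1);

            result[i] = ArgMax(averaged);
        }
        return result;
    }

    public static void Main()
    {
        // Hypothetical score vectors for three consecutive sentences, classes 0..2.
        float[][] scores =
        {
            new[] { 0.90f, 0.05f, 0.05f },
            new[] { 0.30f, 0.40f, 0.30f }, // short, ambiguous fragment
            new[] { 0.85f, 0.10f, 0.05f },
        };

        int[] classes = Resolve(scores, minConfidence: 0.6f);
        Console.WriteLine(string.Join(", ", classes));
    }
}
```

The returned indices refer to positions in the Score array, so they can be turned back into language names through the array returned by TryGetScoreLabelNames.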
Upvotes: 3