Reputation: 21

The ML.NET prediction is HUGELY different compared with Custom Vision

I've trained an object detection model using Azure Custom Vision, exported the model as ONNX, and then imported it into my WPF (.NET Core) project.

I use ML.NET to get predictions from my model, and I found the results are hugely different from the predictions I saw in Custom Vision.

I've tried different orders of extraction (ABGR, ARGB, etc.), but the results are very disappointing. Can anyone give me some advice? There isn't much documentation online about using Custom Vision's ONNX model with WPF to do object detection.

Here are some snippets:

        // Model creation and pipeline definition for images needs to run just once, so calling it from the constructor:
        var pipeline = mlContext.Transforms
            .ResizeImages(
                resizing: ImageResizingEstimator.ResizingKind.Fill,
                outputColumnName: MLObjectDetectionSettings.InputTensorName,
                imageWidth: MLObjectDetectionSettings.ImageWidth,
                imageHeight: MLObjectDetectionSettings.ImageHeight,
                inputColumnName: nameof(MLObjectDetectionInputData.Image))
            .Append(mlContext.Transforms.ExtractPixels(
                colorsToExtract: ImagePixelExtractingEstimator.ColorBits.Rgb,
                orderOfExtraction: ImagePixelExtractingEstimator.ColorsOrder.ABGR,
                outputColumnName: MLObjectDetectionSettings.InputTensorName))
            .Append(mlContext.Transforms.ApplyOnnxModel(
                modelFile: modelPath,
                outputColumnName: MLObjectDetectionSettings.OutputTensorName,
                inputColumnName: MLObjectDetectionSettings.InputTensorName));

        //Create empty DataView. We just need the schema to call fit()
        var emptyData = new List<MLObjectDetectionInputData>();
        var dataView = mlContext.Data.LoadFromEnumerable(emptyData);

        //Generate a model.
        var model = pipeline.Fit(dataView);

Then I use the model to create a prediction engine and get a prediction.

            //Create prediction engine.
            var predictionEngine = _mlObjectDetectionContext.Model.CreatePredictionEngine<MLObjectDetectionInputData, MLObjectDetectionPrediction>(_mlObjectDetectionModel);

            //Load tag labels.
            var labels = File.ReadAllLines(LABELS_OBJECT_DETECTION_FILE_PATH);

            //Create input data.
            var imageInput = new MLObjectDetectionInputData { Image = this.originalImage };


            //Predict.
            var prediction = predictionEngine.Predict(imageInput);
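For reference, the MLObjectDetectionInputData and MLObjectDetectionPrediction classes are not shown in the question. A minimal sketch of how such classes are typically defined for an ML.NET ONNX image pipeline follows; the image size, property names, and the output tensor name are placeholders, not values taken from the question:

    using System.Drawing;
    using Microsoft.ML.Data;
    using Microsoft.ML.Transforms.Image;

    public class MLObjectDetectionInputData
    {
        // Height/width should match MLObjectDetectionSettings.ImageHeight/ImageWidth
        // used in the ResizeImages step (416x416 is only an assumed example).
        [ImageType(416, 416)]
        public Bitmap Image { get; set; }
    }

    public class MLObjectDetectionPrediction
    {
        // The column name must match MLObjectDetectionSettings.OutputTensorName,
        // i.e. the ONNX model's output tensor ("model_outputs0" is a hypothetical name).
        [ColumnName("model_outputs0")]
        public float[] PredictedOutput { get; set; }
    }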

Upvotes: 2

Views: 333

Answers (2)

Tomer Dror

Reputation: 81

Maybe it's because the aspect ratio is not preserved during the resize.

Try with an image of size:

MLObjectDetectionSettings.ImageWidth * MLObjectDetectionSettings.ImageHeight

And you will see much better results.

I think Azure does preliminary processing on the image, maybe padding (also during training?) or cropping.

Maybe during the processing it also uses a moving window (the size that the model expects) and then does some aggregation.
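If the aspect ratio is the issue, one way to test it is to pad the image to a square before handing it to the pipeline, so that ResizeImages with ResizingKind.Fill does not distort the image. A rough sketch, assuming black padding and a hypothetical PadToSquare helper (Custom Vision does not document the exact preprocessing for this export):

    using System;
    using System.Drawing;

    // Pad the source bitmap onto a square canvas so the later
    // ResizeImages(Fill) step does not change the aspect ratio.
    static Bitmap PadToSquare(Bitmap source)
    {
        int side = Math.Max(source.Width, source.Height);
        var padded = new Bitmap(side, side);
        using (var g = Graphics.FromImage(padded))
        {
            g.Clear(Color.Black); // assumed padding color
            g.DrawImage(source,
                (side - source.Width) / 2,
                (side - source.Height) / 2,
                source.Width,
                source.Height);
        }
        return padded;
    }

    // Usage with the question's input class:
    // var imageInput = new MLObjectDetectionInputData { Image = PadToSquare(this.originalImage) };

Note that any predicted bounding boxes would then be relative to the padded image and would need to be mapped back to the original image's coordinates.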

Upvotes: 0

Arif

Reputation: 11

Can you check that the image input (imageInput) is resized to the same size the model requires when you prepare the pipeline, i.e. both resize parameters: imageWidth: MLObjectDetectionSettings.ImageWidth and imageHeight: MLObjectDetectionSettings.ImageHeight?

Also, the ExtractPixels parameters, especially ColorBits and ColorsOrder, should follow the model requirements.
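For example, if the model expects interleaved RGB input (an assumption; the actual requirement depends on the exported ONNX model), the two parameters should be chosen consistently rather than mixing ColorBits.Rgb with ColorsOrder.ABGR as in the question's pipeline:

    // Extract the R, G, B channels laid out in RGB order
    // (alpha is skipped because it is not in colorsToExtract).
    .Append(mlContext.Transforms.ExtractPixels(
        colorsToExtract: ImagePixelExtractingEstimator.ColorBits.Rgb,
        orderOfExtraction: ImagePixelExtractingEstimator.ColorsOrder.ARGB,
        outputColumnName: MLObjectDetectionSettings.InputTensorName))

This fragment would replace the ExtractPixels step in the question's pipeline; with ColorsOrder.ABGR the same channels come out in B, G, R order instead.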

Hope this helps.

Arif

Upvotes: 0
