gustavo martins
gustavo martins

Reputation: 83

Tflite custom object detection model for flutter application using google_mlkit_object_detection package

I'm trying to create a custom object detection model in tflite format so that I can use it in a flutter application with the google_mlkit_object_detection package.

The first few models, I created using yolov8 and converted to tflite. The annotations were made with roboflow and the google colab notebook I used was provided by them, the metadata of the converted tflite model looks like the following image

enter image description here

On this model I was getting the error

Input tensor has type kTfLiteFloat32: it requires specifying NormalizationOptions metadata to preprocess input images.

So as suggested I tried to change the metadata and add normalizationOptions but failed to do so. My second alternative was to train a model with the official TensorFlow google colab notebook TensorFlow Lite Model Maker and it generated a model with the following metadata

enter image description here

For this model the error was

Unexpected number of dimensions for output index 1: got 3D, expected either 2D (BxN with B=1) or 4D

So I checked the model from the example app from the package I am using "google_mlkit_object_detection" and the metadata looks like this

enter image description here

So my question is, how can I alter the models I already trained whichever it is easier, to look like this, both input and output, do I have to alter my model's architecture or just the metadata? The second one trained with the official notebook from tensor flow, it seems that all I have to do is include the correct shape format [1,N], but again I might have to change the architecture.

Upvotes: 2

Views: 2188

Answers (3)

Chathura Chamikara
Chathura Chamikara

Reputation: 114

For on-device object detection, I would suggest you to use tflite_flutter.

What ever you use, you will need to normalize the image inputs. You can use something like this;

import 'package:image/image.dart' as img; 
   
 Future<Uint8List> imageToByteListFloat32(
      img.Image image, int inputSize, double mean, double std) async {
    var convertedBytes = Float32List(1 * inputSize * inputSize * 3);
    var buffer = Float32List.view(convertedBytes.buffer);
    int pixelIndex = 0;
    for (var i = 0; i < inputSize; i++) {
      for (var j = 0; j < inputSize; j++) {
        var pixel = image.getPixel(j, i);
        buffer[pixelIndex++] = (pixel.r - mean) / std;
        buffer[pixelIndex++] = (pixel.g - mean) / std;
        buffer[pixelIndex++] = (pixel.b - mean) / std;
      }
    }

    return convertedBytes.buffer.asUint8List();
  }

Then you will have to figure out the outputs. Use something like neutron to identify the output shape. So in your first YOLOv8 model, output shape is [1,13,13125]. 1- batch size, 13 - first 4 represents the bounding box coordinates (x,y,width,height) and remains denote the each class score (I assume you have 9 classes in total.), 13125 - represents the possible number of bounding boxes. So you will have to loop each an every box and filter what you need.

Upvotes: 0

Norbert Fucsk&#243;
Norbert Fucsk&#243;

Reputation: 36

The Custom Models page on the Google documentation says this:

Note: ML Kit only supports custom image classification models. Although AutoML Vision allows training of object detection models, these cannot be used with ML Kit.

So there it is, you can use custom object detection models, but only if they are image classification models. I thought this must be impossible, an object detection model outputs bounding boxes, while a classification model outputs class scores. However, I tried with the YOLOv8 model and the standard object detection model wouldn't work, but the classification model with the [1, 1000] output shape actually works with the Google MLKit example application and you can extract the bounding boxes from it.

I'm not 100% sure how this can work, but what I suspect is that there is a default object detector bundled with the package, which identifies where there could be objects, and then you can only modify the classification model on top of it.

Anyways the simple answer is: Use a classification model with a [1, N] or [1, 1, 1, N] output where N is the number of classes. If you have a model with a different architecture, then you should change the output to this format, otherwise it is not supposed to work.

Upvotes: 2

Steven
Steven

Reputation: 356

Metadata is just for providing the information of the model. Beside adding metadata, you need to make sure your model really meet the requirements.

More details about how to get such a model can be found here: https://developers.google.com/ml-kit/custom-models

The ML Kit Object Detection custom model is a classifier model.

Upvotes: 0

Related Questions