Reputation: 11
So I've developed an application using Google ML Kit and its Object Detection API. One thing I noticed is that the TFLite image classification model I'm using, TFLite (efficientnet/lite3/int8), only outputs the probabilities of the classes the object might fall into, unlike most object detection models, which also return bounding box values. However, only image classification models are supported by ML Kit's Object Detection API, and yet the API somehow returns bounding box values as well. My question is: how does the API return bounding boxes when the TFLite model itself only returns a list of class probabilities?
Upvotes: 1
Views: 127
Reputation: 356
The ML Kit Object Detection SDK pipeline contains two models: a detector model that detects objects in the image, and a classification model that runs on the cropped image of each detected object. The pipeline also handles other work for you, e.g. caching results to improve performance and tracking objects with a tracking ID in streaming mode.
The detector model is internal and cannot be changed. It is the source of the bounding boxes.
The part you can change via the Custom variant of the API is the classification model.
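To make the two-stage idea concrete, here is a rough conceptual sketch in plain Python. It is not ML Kit's actual implementation (the internal detector is not public), and every name in it (`run_pipeline`, `stub_detector`, `stub_classifier`, `DetectedObject`) is hypothetical. It only illustrates how the boxes can come from one model while the class probabilities come from another that never sees coordinates.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]   # (left, top, right, bottom), hypothetical layout
Image = List[List[int]]           # rows of grayscale pixel values, for illustration


@dataclass
class DetectedObject:
    bounding_box: Box             # produced by the detector stage
    probabilities: List[float]    # produced by the classifier stage


def crop(image: Image, box: Box) -> Image:
    """Cut the region described by box out of the image."""
    left, top, right, bottom = box
    return [row[left:right] for row in image[top:bottom]]


def run_pipeline(
    image: Image,
    detector: Callable[[Image], List[Box]],
    classifier: Callable[[Image], List[float]],
) -> List[DetectedObject]:
    """Stage 1 finds boxes; stage 2 classifies each cropped region."""
    results = []
    for box in detector(image):
        patch = crop(image, box)
        results.append(DetectedObject(box, classifier(patch)))
    return results


def stub_detector(image: Image) -> List[Box]:
    # Stand-in for the fixed internal detector: pretends it found
    # one object in the top-left 2x2 region.
    return [(0, 0, 2, 2)]


def stub_classifier(patch: Image) -> List[float]:
    # Stand-in for a swappable classification model: returns only
    # class probabilities ("bright", "dark"), no coordinates.
    pixels = [p for row in patch for p in row]
    bright = (sum(pixels) / len(pixels)) / 255
    return [bright, 1 - bright]


image = [
    [255, 255, 0],
    [255, 255, 0],
    [0, 0, 0],
]
results = run_pipeline(image, stub_detector, stub_classifier)
```

In this sketch, swapping `stub_classifier` for a different function changes the labels but leaves the boxes untouched, which mirrors what you see when you plug a custom classification model into the API.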
Upvotes: 0