Reputation: 7343
I have an Android tablet where I installed the TensorFlow Lite DetectorActivity from the available examples, and it works well there. However, when I tried to deploy it on a Raspberry Pi 3 Model B running Android Things, it did not run; there seemed to be a problem with configuring the camera properly to enable a live camera preview and run the analysis.
My original goal is to make an object detection app run on Android Things, and it is essential that it draws a bounding rectangle around each detected object.
I was looking for an example of an Android app that uses TensorFlow Lite and runs on Android Things, and I quickly found this example from hackster.io that uses image classification to dispense candy. I ran it on my Raspberry Pi board and it worked: it reports the name of the detected object, the confidence level, and the ID. I was happy to build on this sample code; instead of a live camera feed, the app could simply take a photo, analyze it, give the result, then take another photo, and so on.
However, it does not report the object's location as a RectF object.
What I tried to do was adapt the recognizeImage function from the TFLite Android example (it lives in the TFLiteObjectDetectionAPIModel class) to the doIdentification function of the Candy Dispenser Android app. My function now looks like this:
// outputLocations: array of shape [Batchsize, NUM_DETECTIONS,4]
// contains the location of detected boxes
private float[][][] outputLocations;
// outputClasses: array of shape [Batchsize, NUM_DETECTIONS]
// contains the classes of detected boxes
private float[][] outputClasses;
// outputScores: array of shape [Batchsize, NUM_DETECTIONS]
// contains the scores of detected boxes
private float[][] outputScores;
// numDetections: array of shape [Batchsize]
// contains the number of detected boxes
private float[] numDetections;
private static final int NUM_DETECTIONS = 10;
private static final float IMAGE_MEAN = 128.0f;
private static final float IMAGE_STD = 128.0f;
private void doIdentification(Bitmap image) {
    Log.e(TAG, "doing identification!");
    Trace.beginSection("recognizeImage");

    int numBytesPerChannel;
    if (TF_OD_API_IS_QUANTIZED) {
        Log.e(TAG, "model is quantized");
        numBytesPerChannel = 1; // Quantized
    } else {
        Log.e(TAG, "model is NOT quantized");
        numBytesPerChannel = 4; // Floating point
    }

    ByteBuffer imgData = ByteBuffer.allocateDirect(
            1 * TF_INPUT_IMAGE_HEIGHT * TF_INPUT_IMAGE_HEIGHT * 3 * numBytesPerChannel);

    Trace.beginSection("preprocessBitmap");
    // Preprocess the image data from 0-255 int to normalized float based
    // on the provided parameters.
    int[] intValues = new int[TF_INPUT_IMAGE_HEIGHT * TF_INPUT_IMAGE_HEIGHT];
    image.getPixels(intValues, 0, image.getWidth(), 0, 0, image.getWidth(), image.getHeight());

    imgData.rewind();
    for (int i = 0; i < TF_INPUT_IMAGE_HEIGHT; ++i) {
        for (int j = 0; j < TF_INPUT_IMAGE_HEIGHT; ++j) {
            int pixelValue = intValues[i * TF_INPUT_IMAGE_HEIGHT + j];
            if (TF_OD_API_IS_QUANTIZED) {
                imgData.put((byte) ((pixelValue >> 16) & 0xFF));
                imgData.put((byte) ((pixelValue >> 8) & 0xFF));
                imgData.put((byte) (pixelValue & 0xFF));
            } else {
                imgData.putFloat((((pixelValue >> 16) & 0xFF) - IMAGE_MEAN) / IMAGE_STD);
                imgData.putFloat((((pixelValue >> 8) & 0xFF) - IMAGE_MEAN) / IMAGE_STD);
                imgData.putFloat(((pixelValue & 0xFF) - IMAGE_MEAN) / IMAGE_STD);
            }
        }
    }
    Trace.endSection(); // preprocessBitmap

    // Allocate space for the inference results
    byte[][] confidencePerLabel = new byte[1][mLabels.size()];

    // For box detections
    // Copy the input data into TensorFlow.
    Trace.beginSection("feed");
    outputLocations = new float[1][NUM_DETECTIONS][4];
    outputClasses = new float[1][NUM_DETECTIONS];
    outputScores = new float[1][NUM_DETECTIONS];
    numDetections = new float[1];

    Object[] inputArray = {imgData};
    Map<Integer, Object> outputMap = new HashMap<>();
    outputMap.put(0, outputLocations);
    outputMap.put(1, outputClasses);
    outputMap.put(2, outputScores);
    outputMap.put(3, numDetections);
    Trace.endSection();

    // Read image data into buffer formatted for the TensorFlow model
    TensorFlowHelper.convertBitmapToByteBuffer(image, intValues, imgData);

    // Run inference on the network with the image bytes in imgData as input,
    // storing results on the confidencePerLabel array.
    Trace.beginSection("run");
    mTensorFlowLite.runForMultipleInputsOutputs(inputArray, outputMap);
    Trace.endSection();

    // TODO - we try and fetch our RectFs here
    final ArrayList<Recognition> recognitions = new ArrayList<>(NUM_DETECTIONS);
    for (int i = 0; i < NUM_DETECTIONS; ++i) {
        final RectF detection =
                new RectF(
                        outputLocations[0][i][1] * TF_OD_API_INPUT_SIZE,
                        outputLocations[0][i][0] * TF_OD_API_INPUT_SIZE,
                        outputLocations[0][i][3] * TF_OD_API_INPUT_SIZE,
                        outputLocations[0][i][2] * TF_OD_API_INPUT_SIZE);
        // SSD Mobilenet V1 Model assumes class 0 is background class
        // in label file and class labels start from 1 to number_of_classes+1,
        // while outputClasses correspond to class index from 0 to number_of_classes
        int labelOffset = 1;
        Log.e(TAG, "adding the following to our results: ");
        Log.e(TAG, "recognition id: " + i);
        Log.e(TAG, "recognition label: " + mLabels.get((int) outputClasses[0][i] + labelOffset));
        Log.e(TAG, "recognition confidence: " + outputScores[0][i]);
        recognitions.add(
                new Recognition(
                        "" + i,
                        mLabels.get((int) outputClasses[0][i] + labelOffset),
                        outputScores[0][i],
                        detection));
    }
    Trace.endSection(); // "recognizeImage"

    // TODO -- This is the old working code
    // Get the results with the highest confidence and map them to their labels
    Collection<Recognition> results = TensorFlowHelper.getBestResults(confidencePerLabel, mLabels);
    Log.e(TAG, "results count is = " + results.size());

    // Report the results with the highest confidence
    onClassificationComplete(results);
}
I set the TF_OD_API_IS_QUANTIZED constant to true and ran the code. However, I was greeted by the following error:
java.lang.IllegalArgumentException: Cannot convert between a TensorFlowLite tensor with type UINT8 and a Java object of type [[[F (which is compatible with the TensorFlowLite type FLOAT32).
and the line responsible is:
mTensorFlowLite.runForMultipleInputsOutputs(inputArray, outputMap);
I tried changing it to just mTensorFlowLite.run(), but that resulted in a different error.
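For reference, the two Interpreter entry points I am switching between look roughly like this (a minimal sketch based on the org.tensorflow.lite.Interpreter API, reusing variable names from my code above, not my exact code):

// Single input and single output: the candy dispenser's classification path.
mTensorFlowLite.run(imgData, confidencePerLabel);

// One or more inputs plus an index-to-array map of outputs: what the
// TFLite object detection example uses for its four output tensors.
mTensorFlowLite.runForMultipleInputsOutputs(inputArray, outputMap);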
Has anyone implemented Object Detection (drawing a RectF on the detected objects) on Android Things?
Upvotes: 2
Views: 1546
Reputation: 63293
The issue here is likely that you are mixing models. There is a difference between image classification and object detection models: classification simply reports the confidence that a certain type of object appears in the image, while detection additionally identifies the object's location. The candy dispenser sample you are starting from uses an image classification model (mobilenet_quant_v1_224.tflite), whereas the TFLite sample you mentioned runs an object detection model (mobilenet_ssd.tflite).
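To make the mismatch concrete, here is a rough sketch of the outputs each kind of model produces (names like NUM_LABELS, NUM_DETECTIONS, and the exact output ordering are assumptions based on the two samples, so check your own model):

// Quantized image classification model (e.g. mobilenet_quant_v1_224.tflite):
// a single UINT8 output tensor with one confidence byte per label,
// filled by a plain interpreter.run(imgData, labelProbArray) call.
byte[][] labelProbArray = new byte[1][NUM_LABELS];

// SSD object detection model (e.g. mobilenet_ssd.tflite): four FLOAT32 output
// tensors, which is why the detection example uses runForMultipleInputsOutputs().
float[][][] outputLocations = new float[1][NUM_DETECTIONS][4]; // box coordinates
float[][] outputClasses = new float[1][NUM_DETECTIONS];        // class indices
float[][] outputScores = new float[1][NUM_DETECTIONS];         // confidences
float[] numDetections = new float[1];

// Binding the detection-style float arrays to the classification model's single
// UINT8 output is what triggers the IllegalArgumentException in your question.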
I would recommend starting from the sample that does object detection and working through the camera issues, rather than approaching the problem the other way around. The candy dispenser sample (as well as the official image classifier sample) provides a good reference for getting the camera on the RPi3 to capture an image and converting it for use with the model.
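As a rough sketch of that capture-and-convert path (the ImageReader setup follows the pattern those samples use; backgroundHandler is an assumed Handler, and doIdentification() and TF_INPUT_IMAGE_HEIGHT are borrowed from your question rather than being a drop-in implementation):

// Grab the latest JPEG frame from the Android Things camera via an ImageReader,
// decode it to a Bitmap, and scale it to the detector's expected input size.
ImageReader reader = ImageReader.newInstance(640, 480, ImageFormat.JPEG, 1);
reader.setOnImageAvailableListener(imageReader -> {
    try (Image image = imageReader.acquireLatestImage()) {
        ByteBuffer buffer = image.getPlanes()[0].getBuffer();
        byte[] bytes = new byte[buffer.remaining()];
        buffer.get(bytes);

        Bitmap raw = BitmapFactory.decodeByteArray(bytes, 0, bytes.length);
        Bitmap input = Bitmap.createScaledBitmap(
                raw, TF_INPUT_IMAGE_HEIGHT, TF_INPUT_IMAGE_HEIGHT, true);

        // Hand the resized bitmap to whatever runs the detection model,
        // e.g. the doIdentification() method from the question.
        doIdentification(input);
    }
}, backgroundHandler);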
Upvotes: 1