Reputation: 7343
I have an Android tablet where I installed the TensorFlow Lite DetectorActivity from the available examples, and it works well there. However, when I tried to deploy it on a Raspberry Pi 3 Model B running Android Things, it did not run; there seemed to be a problem with configuring the camera properly to enable a live camera preview and run the analysis.
My original goal is to make an object detection app run on Android Things, and it is essential that it draws a bounding rectangle around each detected object.
I was looking for an example of an Android app that uses TensorFlow Lite and runs on Android Things, and I quickly found this example from hackster.io that uses image classification to dispense candy. I ran it on my Raspberry Pi board and it worked: it reports the name of the detected object, the confidence level, and the ID. I was happy to build on this sample code; instead of a live camera feed, the app could simply take a photo, analyze it, give the result, then take another photo, and so on.
However, it does not report the object's location as a RectF object.
What I tried to do was adapt the recognizeImage function from the TFLite Android example (it lives in the TFLiteObjectDetectionAPIModel class) to the doIdentification function of the Candy Dispenser Android app. My function now looks like this:
// outputLocations: array of shape [Batchsize, NUM_DETECTIONS,4]
// contains the location of detected boxes
private float[][][] outputLocations;
// outputClasses: array of shape [Batchsize, NUM_DETECTIONS]
// contains the classes of detected boxes
private float[][] outputClasses;
// outputScores: array of shape [Batchsize, NUM_DETECTIONS]
// contains the scores of detected boxes
private float[][] outputScores;
// numDetections: array of shape [Batchsize]
// contains the number of detected boxes
private float[] numDetections;
private static final int NUM_DETECTIONS = 10;
private static final float IMAGE_MEAN = 128.0f;
private static final float IMAGE_STD = 128.0f;
private void doIdentification(Bitmap image) {
    Log.e(TAG, "doing identification!");
    Trace.beginSection("recognizeImage");

    int numBytesPerChannel;
    if (TF_OD_API_IS_QUANTIZED) {
        Log.e(TAG, "model is quantized");
        numBytesPerChannel = 1; // Quantized
    } else {
        Log.e(TAG, "model is NOT quantized");
        numBytesPerChannel = 4; // Floating point
    }

    ByteBuffer imgData = ByteBuffer.allocateDirect(
            1 * TF_INPUT_IMAGE_HEIGHT * TF_INPUT_IMAGE_HEIGHT * 3 * numBytesPerChannel);

    Trace.beginSection("preprocessBitmap");
    // Preprocess the image data from 0-255 int to normalized float based
    // on the provided parameters.
    int[] intValues = new int[TF_INPUT_IMAGE_HEIGHT * TF_INPUT_IMAGE_HEIGHT];
    image.getPixels(intValues, 0, image.getWidth(), 0, 0, image.getWidth(), image.getHeight());

    imgData.rewind();
    for (int i = 0; i < TF_INPUT_IMAGE_HEIGHT; ++i) {
        for (int j = 0; j < TF_INPUT_IMAGE_HEIGHT; ++j) {
            int pixelValue = intValues[i * TF_INPUT_IMAGE_HEIGHT + j];
            if (TF_OD_API_IS_QUANTIZED) {
                imgData.put((byte) ((pixelValue >> 16) & 0xFF));
                imgData.put((byte) ((pixelValue >> 8) & 0xFF));
                imgData.put((byte) (pixelValue & 0xFF));
            } else {
                imgData.putFloat((((pixelValue >> 16) & 0xFF) - IMAGE_MEAN) / IMAGE_STD);
                imgData.putFloat((((pixelValue >> 8) & 0xFF) - IMAGE_MEAN) / IMAGE_STD);
                imgData.putFloat(((pixelValue & 0xFF) - IMAGE_MEAN) / IMAGE_STD);
            }
        }
    }
    Trace.endSection(); // preprocessBitmap

    // Allocate space for the inference results
    byte[][] confidencePerLabel = new byte[1][mLabels.size()];

    // For box detections
    // Copy the input data into TensorFlow.
    Trace.beginSection("feed");
    outputLocations = new float[1][NUM_DETECTIONS][4];
    outputClasses = new float[1][NUM_DETECTIONS];
    outputScores = new float[1][NUM_DETECTIONS];
    numDetections = new float[1];

    Object[] inputArray = {imgData};
    Map<Integer, Object> outputMap = new HashMap<>();
    outputMap.put(0, outputLocations);
    outputMap.put(1, outputClasses);
    outputMap.put(2, outputScores);
    outputMap.put(3, numDetections);
    Trace.endSection();

    // Read image data into buffer formatted for the TensorFlow model
    TensorFlowHelper.convertBitmapToByteBuffer(image, intValues, imgData);

    // Run inference on the network with the image bytes in imgData as input,
    // storing results on the confidencePerLabel array.
    Trace.beginSection("run");
    mTensorFlowLite.runForMultipleInputsOutputs(inputArray, outputMap);
    Trace.endSection();

    // TODO - we try and fetch our RectFs here
    final ArrayList<Recognition> recognitions = new ArrayList<>(NUM_DETECTIONS);
    for (int i = 0; i < NUM_DETECTIONS; ++i) {
        final RectF detection =
                new RectF(
                        outputLocations[0][i][1] * TF_OD_API_INPUT_SIZE,
                        outputLocations[0][i][0] * TF_OD_API_INPUT_SIZE,
                        outputLocations[0][i][3] * TF_OD_API_INPUT_SIZE,
                        outputLocations[0][i][2] * TF_OD_API_INPUT_SIZE);
        // SSD Mobilenet V1 Model assumes class 0 is background class
        // in label file and class labels start from 1 to number_of_classes+1,
        // while outputClasses correspond to class index from 0 to number_of_classes
        int labelOffset = 1;
        Log.e(TAG, "adding the following to our results: ");
        Log.e(TAG, "recognition id: " + i);
        Log.e(TAG, "recognition label: " + mLabels.get((int) outputClasses[0][i] + labelOffset));
        Log.e(TAG, "recognition confidence: " + outputScores[0][i]);
        recognitions.add(
                new Recognition(
                        "" + i,
                        mLabels.get((int) outputClasses[0][i] + labelOffset),
                        outputScores[0][i],
                        detection));
    }
    Trace.endSection(); // "recognizeImage"

    // TODO -- This is the old working code
    // Get the results with the highest confidence and map them to their labels
    Collection<Recognition> results = TensorFlowHelper.getBestResults(confidencePerLabel, mLabels);
    Log.e(TAG, "results count is = " + results.size());

    // Report the results with the highest confidence
    onClassificationComplete(results);
}
I set the TF_OD_API_IS_QUANTIZED constant to true and ran the code. However, I was greeted by the following error:
java.lang.IllegalArgumentException: Cannot convert between a TensorFlowLite tensor with type UINT8 and a Java object of type [[[F (which is compatible with the TensorFlowLite type FLOAT32).
and the line responsible is:
mTensorFlowLite.runForMultipleInputsOutputs(inputArray, outputMap);
I tried changing it to just mTensorFlowLite.run(), but that resulted in a different error.
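For reference, the two Interpreter entry points I am switching between look roughly like this (a minimal sketch based on the org.tensorflow.lite.Interpreter API, reusing variable names from my code above, not my exact code):

// Single input and single output: the candy dispenser's classification path.
mTensorFlowLite.run(imgData, confidencePerLabel);

// One or more inputs plus an index-to-array map of outputs: what the
// TFLite object detection example uses for its four output tensors.
mTensorFlowLite.runForMultipleInputsOutputs(inputArray, outputMap);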
Has anyone implemented Object Detection (drawing a RectF on the detected objects) on Android Things?
Upvotes: 2
Views: 1546
Reputation: 63293
The issue here is likely that you are mixing models. There is a difference between image classification and object detection models: classification simply reports the confidence that a certain type of object appears in the image, while detection additionally identifies the object's location. The candy dispenser sample you are starting from uses an image classification model (mobilenet_quant_v1_224.tflite), whereas the TFLite sample you mentioned runs an object detection model (mobilenet_ssd.tflite).
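To make the mismatch concrete, here is a rough sketch of the outputs each kind of model produces (names like NUM_LABELS, NUM_DETECTIONS, and the exact output ordering are assumptions based on the two samples, so check your own model):

// Quantized image classification model (e.g. mobilenet_quant_v1_224.tflite):
// a single UINT8 output tensor with one confidence byte per label,
// filled by a plain interpreter.run(imgData, labelProbArray) call.
byte[][] labelProbArray = new byte[1][NUM_LABELS];

// SSD object detection model (e.g. mobilenet_ssd.tflite): four FLOAT32 output
// tensors, which is why the detection example uses runForMultipleInputsOutputs().
float[][][] outputLocations = new float[1][NUM_DETECTIONS][4]; // box coordinates
float[][] outputClasses = new float[1][NUM_DETECTIONS];        // class indices
float[][] outputScores = new float[1][NUM_DETECTIONS];         // confidences
float[] numDetections = new float[1];

// Binding the detection-style float arrays to the classification model's single
// UINT8 output is what triggers the IllegalArgumentException in your question.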
I would recommend starting from the sample that does object detection and working through the camera issues, rather than approaching the problem the other way around. The candy dispenser sample (as well as the official image classifier sample) provides a good reference for getting the camera on the RPi3 to capture an image and converting it for use with the model.
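As a rough sketch of that capture-and-convert path (the ImageReader setup follows the pattern those samples use; backgroundHandler is an assumed Handler, and doIdentification() and TF_INPUT_IMAGE_HEIGHT are borrowed from your question rather than being a drop-in implementation):

// Grab the latest JPEG frame from the Android Things camera via an ImageReader,
// decode it to a Bitmap, and scale it to the detector's expected input size.
ImageReader reader = ImageReader.newInstance(640, 480, ImageFormat.JPEG, 1);
reader.setOnImageAvailableListener(imageReader -> {
    try (Image image = imageReader.acquireLatestImage()) {
        ByteBuffer buffer = image.getPlanes()[0].getBuffer();
        byte[] bytes = new byte[buffer.remaining()];
        buffer.get(bytes);

        Bitmap raw = BitmapFactory.decodeByteArray(bytes, 0, bytes.length);
        Bitmap input = Bitmap.createScaledBitmap(
                raw, TF_INPUT_IMAGE_HEIGHT, TF_INPUT_IMAGE_HEIGHT, true);

        // Hand the resized bitmap to whatever runs the detection model,
        // e.g. the doIdentification() method from the question.
        doIdentification(input);
    }
}, backgroundHandler);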
Upvotes: 1