sundowatch

Reputation: 3103

Google Cloud AutoML prediction on Docker

I've trained a multi-class object detection model on Google Cloud AutoML and downloaded it via Container export. Then I deployed it on Docker using the Google Cloud AutoML Docker image and sent a request with this code:

import base64
import io
import json
import requests


def process(image_file_path, image_key="1", port_number=8501):
    # Read the image and base64-encode it for the JSON payload.
    with io.open(image_file_path, 'rb') as image_file:
        encoded_image = base64.b64encode(image_file.read()).decode('utf-8')

    # Request body: one instance holding the image bytes and a key.
    instances = {
            "instances": [
                    {
                        "image_bytes": {
                            "b64": encoded_image
                        },
                        "key": image_key
                    }
            ]
    }

    url = 'http://localhost:{}/v1/models/default:predict'.format(port_number)

    # POST the payload to the container and return the parsed JSON response.
    response = requests.post(url, data=json.dumps(instances))
    return response.json()

I successfully got the response from the Docker container in JSON format:

{
    "predictions": [{
        "detection_multiclass_scores": [
            [0.00540795922, 0.99754715], 
            ...
        ],
        "detection_classes": [1.0, ...],
        "num_detections": 40.0,
        "image_info": [320, 320, 1, 0, 320, 320],
        "detection_boxes": [
            [0.0382162929, 0.0984618068, 0.746192276, 0.991413414], 
            ...
        ],
        "detection_scores": [0.99754715, ...],
        "detection_classes_as_text": ["image_class", ...],
        "key": "1"
    }]
}

At this point, I want to know where the detected bounding box is in the image. I know this information comes from detection_boxes, but I need to convert the values to pixels because I'll process the bounding boxes further.

What is the format of detection_boxes?

Upvotes: 1

Views: 287

Answers (1)

shortcipher3

Reputation: 1380

The format of each entry in detection_boxes is [min_y, min_x, max_y, max_x]. These values are normalized by the height and width of the image, so to get pixel coordinates, multiply the y values by the image height and the x values by the image width.

This is the same format used by the TensorFlow Object Detection API; you can read about the format here
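As a rough illustration of that conversion, here is a minimal sketch. The function name boxes_to_pixels and its parameters are hypothetical (they are not part of AutoML or TensorFlow), and it assumes you pass the width and height, in pixels, of the image you want the boxes drawn on.

```python
def boxes_to_pixels(detection_boxes, image_width, image_height):
    """Convert normalized [min_y, min_x, max_y, max_x] boxes to pixels.

    Hypothetical helper: returns (left, top, right, bottom) integer tuples.
    Assumes image_width and image_height are the pixel dimensions of the
    image the boxes should be mapped onto.
    """
    pixel_boxes = []
    for min_y, min_x, max_y, max_x in detection_boxes:
        pixel_boxes.append((
            int(round(min_x * image_width)),   # left
            int(round(min_y * image_height)),  # top
            int(round(max_x * image_width)),   # right
            int(round(max_y * image_height)),  # bottom
        ))
    return pixel_boxes


if __name__ == "__main__":
    # One made-up normalized box on a 200x100 (width x height) image.
    print(boxes_to_pixels([[0.1, 0.2, 0.5, 0.6]], image_width=200, image_height=100))
```

With a response like the one in the question, you would pass response["predictions"][0]["detection_boxes"] as the first argument. Note that the tuple order is switched to x-first (left, top, right, bottom), which is what most drawing libraries expect; keep the y-first order if your downstream code needs it.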

Upvotes: 2
