Arpan Srivastava

Reputation: 31

Interpretation of yolov5 output

I am making a face mask detection project and I trained my model using ultralytics/yolov5. I saved the trained model as an ONNX file; you can find the model file here: model.onnx. Now I want to use this model.onnx with OpenCV to detect face masks in real time. The input image size during training was 320*320. You can visualize this model using Netron. I have written this code to capture an image from the webcam and pass it to model.onnx to predict the bounding boxes. The code is as follows:

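import cv2
import numpy as np
import onnxruntime

model_path = 'model.onnx'

def predict(img):
    session = onnxruntime.InferenceSession(model_path)
    input_name = session.get_inputs()[0].name
    output_name = session.get_outputs()[0].name
    # Resize to the 320x320 training size, then convert BGR -> RGB and HWC -> CHW.
    # A bare reshape would scramble the pixel layout instead of reordering the axes.
    img = cv2.cvtColor(cv2.resize(img, (320, 320)), cv2.COLOR_BGR2RGB)
    # Add the batch dimension and scale pixel values to [0, 1].
    data = np.ascontiguousarray(img.transpose(2, 0, 1)[np.newaxis], dtype='float32') / 255.0
    result = session.run([output_name], {input_name: data})
    result = np.array(result)
    print(result.shape)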

The output of result.shape is (1, 1, 3, 40, 40, 85). Can anyone help me interpret this shape, and explain how I can use this result array to get the predicted class, bounding box and confidence?

Upvotes: 3

Views: 3572

Answers (1)

Grant Allan

Reputation: 319

I've never worked with a pure yolov5 model, but here's the output format for yolov5s. It looks like it should be similar.

output tensor structure (yolov5s):
output_tensor[a, b, c, d]
    a -> image index (If your input is a batch of images, this tells you which image's output you're looking at. If your input is just one image, leave this as 0.)
    b -> index of image in batch
    c -> information about bounding box
        0, 1 -> x and y coordinate of bounding box center
        2, 3 -> width and height of bounding box
        4 -> bounding box confidence
        5 - 84 -> single class confidences
    d -> index of proposed bounding boxes
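
For the shape in the question, (1, 1, 3, 40, 40, 85), the leading 1 comes from wrapping the output list in np.array, so result[0] is most likely a single detection head laid out as (batch, anchor, grid row, grid column, 5 + number of classes). Below is a minimal sketch of how such a head could be decoded with NumPy. It rests on assumptions I can't confirm from the question: that the values are raw (pre-sigmoid) outputs, that this head has stride 8 (320 / 40), and that the model uses the default YOLOv5 anchors for that scale (auto-anchoring during training may have changed them). The names decode_head and ANCHORS_P3 are made up for this example.

```python
import numpy as np

# Assumed default YOLOv5 anchors (width, height in pixels) for the stride-8 head.
ANCHORS_P3 = [(10, 13), (16, 30), (33, 23)]

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def decode_head(raw, anchors, stride, conf_thres=0.25):
    """Decode one raw head of shape (1, na, ny, nx, 5 + nc).

    Returns (boxes, scores, class_ids), where boxes are
    (center_x, center_y, width, height) in input-image pixels.
    """
    _, na, ny, nx, _ = raw.shape
    y = sigmoid(raw[0])  # (na, ny, nx, 5 + nc), assumes raw logits

    # Per-cell (x, y) offsets of the grid, shape (ny, nx, 2).
    gy, gx = np.meshgrid(np.arange(ny), np.arange(nx), indexing="ij")
    grid = np.stack((gx, gy), axis=-1)

    # Box center and size, following the YOLOv5 detection-layer formulas.
    anchor_wh = np.asarray(anchors, dtype=np.float32).reshape(na, 1, 1, 2)
    xy = (y[..., 0:2] * 2.0 - 0.5 + grid) * stride
    wh = (y[..., 2:4] * 2.0) ** 2 * anchor_wh

    # Final score = objectness * best class probability.
    obj = y[..., 4]
    cls = y[..., 5:]
    class_ids = cls.argmax(axis=-1)
    scores = obj * cls.max(axis=-1)

    keep = scores > conf_thres
    boxes = np.concatenate((xy[keep], wh[keep]), axis=-1)  # (n, 4)
    return boxes, scores[keep], class_ids[keep]
```

With the array from the question, this would be called as decode_head(result[0], ANCHORS_P3, stride=8). The surviving boxes still overlap heavily, so they need non-maximum suppression afterwards, and the coordinates refer to the 320*320 input, so they must be scaled back to the original frame size. If the exported model has further outputs for the coarser grids (20*20 and 10*10), each would be decoded the same way with its own stride and anchors.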

Upvotes: 0
