Alexandre Tavares

Reputation: 113

Why am I getting different results when I use models with the same weights in different formats (.pt, .onnx, and .bin/.xml)?

I have a custom model trained with YOLOv5s, and it works fine.

This is the input image:

[image: input image]

Running inference with PyTorch gives the expected result:

[image: inference output]

This is the output image:

[image: output image]

However, I need to run the model in OpenVINO, and whether I run inference with the .onnx model or with the .bin and .xml files (OpenVINO IR), I don't get the expected result.

Instead, I get a tensor of shape (1, 25200, 6). I know that:

  1. 25200 is equal to 1x3x80x80 + 1x3x40x40 + 1x3x20x20;
  2. 6 = 1 class + 4 (x,y,w,h) + 1 (score);
  3. batch_size = 1
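
The arithmetic behind that shape can be checked with a short sketch. It assumes the standard YOLOv5 layout (strides 8, 16, 32 and 3 anchors per grid cell) and that each row is ordered as x, y, w, h, objectness, then one score per class; the variable names are only illustrative.

```python
# Count the candidate boxes for a 640x640 input, assuming the standard
# YOLOv5 strides (8, 16, 32) and 3 anchors per grid cell.
imgsz = 640
strides = (8, 16, 32)
anchors_per_cell = 3

grid_sizes = [imgsz // s for s in strides]  # 80, 40, 20
num_boxes = sum(anchors_per_cell * g * g for g in grid_sizes)

# Values per box: 4 box coordinates + 1 objectness + 1 score per class.
num_classes = 1
values_per_box = 4 + 1 + num_classes

print(grid_sizes, num_boxes, values_per_box)
```

So the tensor holds every candidate box from all three detection heads, before any confidence filtering or non-maximum suppression.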

To export it, I used:

!python export.py --data models/custom_yolov5s.yaml --weights /content/bucket_11_03_2022.pt --batch-size 1 --device cpu --include openvino --imgsz 640

To reproduce the issue, I ran inference in two ways:

  1. .onnx:
import cv2
import numpy as np  # needed for np.expand_dims and np.float32 below

image = cv2.imread('data/cropped.png')

# Resize image to meet network expected input sizes
resized_image = cv2.resize(image, (640, 640))

# Reshape to network input shape
input_image = np.expand_dims(resized_image.transpose(2, 0, 1), 0)

import onnxruntime as onnxrt

onnx_session= onnxrt.InferenceSession("models/bucket_11_03_2022.onnx")
onnx_inputs= {onnx_session.get_inputs()[0].name:input_image.astype(np.float32)}
onnx_output = onnx_session.run(None, onnx_inputs)
img_label = onnx_output[0]
print(onnx_output[0].shape)
  2. OpenVINO:
import cv2
import matplotlib.pyplot as plt
import numpy as np
from openvino.inference_engine import IECore

ie = IECore()

net = ie.read_network(
    model="bucket_11_03_2022.xml",
    weights="bucket_11_03_2022.bin",
)
exec_net = ie.load_network(net, "CPU")

output_layer_ir = next(iter(exec_net.outputs))
input_layer_ir = next(iter(exec_net.input_info))

# cv2.imread returns the image in BGR channel order
image = cv2.imread("data/cropped.png")

# N,C,H,W = batch size, number of channels, height, width
N, C, H, W = net.input_info[input_layer_ir].tensor_desc.dims

# Resize image to meet network expected input sizes
resized_image = cv2.resize(image, (W, H))

# Reshape to network input shape
input_image = np.expand_dims(resized_image.transpose(2, 0, 1), 0)

plt.imshow(cv2.cvtColor(image, cv2.COLOR_BGR2RGB));

result = exec_net.infer(inputs={input_layer_ir: input_image})

print(result['output'].shape)

Could you help me get the correct inference result (bounding box with score) using the .onnx model or the OpenVINO IR format (.bin, .xml)?

The model files are here.

Upvotes: 1

Views: 2971

Answers (1)

Zul_Intel

Reputation: 66

Based on my replication, this issue is caused by an incorrect conversion from PyTorch to ONNX. The converted ONNX model was able to detect the object (bucket), but it did not report the correct label: it took one of the class names from coco128.yaml instead.

You may need to retrain your model by following the Train Custom Data tutorial. However, I cannot guarantee that this will work, as it is not validated by OpenVINO.

I suggest posting this issue on the Ultralytics GitHub forum. For your information, Ultralytics is not part of the OpenVINO Toolkit.
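
Separately from the export itself, the (1, 25200, 6) tensor in the question is the raw prediction before confidence filtering and non-maximum suppression (NMS), which the PyTorch detection pipeline applies but the exported model does not. The sketch below shows one way to turn that tensor into final boxes with NumPy. It is a simplified, class-agnostic approximation rather than the YOLOv5 implementation; the function names are hypothetical, the thresholds 0.25 and 0.45 are assumed defaults, and each row is assumed to be (cx, cy, w, h, objectness, class scores) in pixels of the 640x640 input.

```python
import numpy as np

def xywh_to_xyxy(boxes):
    """Convert (cx, cy, w, h) boxes to (x1, y1, x2, y2)."""
    out = np.empty_like(boxes)
    out[:, 0] = boxes[:, 0] - boxes[:, 2] / 2
    out[:, 1] = boxes[:, 1] - boxes[:, 3] / 2
    out[:, 2] = boxes[:, 0] + boxes[:, 2] / 2
    out[:, 3] = boxes[:, 1] + boxes[:, 3] / 2
    return out

def nms(boxes, scores, iou_thres):
    """Greedy NMS; returns indices of kept boxes, highest score first."""
    order = scores.argsort()[::-1]
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        rest = order[1:]
        # Intersection of the best box with every remaining box
        x1 = np.maximum(boxes[i, 0], boxes[rest, 0])
        y1 = np.maximum(boxes[i, 1], boxes[rest, 1])
        x2 = np.minimum(boxes[i, 2], boxes[rest, 2])
        y2 = np.minimum(boxes[i, 3], boxes[rest, 3])
        inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
        iou = inter / (areas[i] + areas[rest] - inter + 1e-9)
        # Drop boxes that overlap the kept box too much
        order = rest[iou <= iou_thres]
    return keep

def postprocess(pred, conf_thres=0.25, iou_thres=0.45):
    """pred: (1, N, 5 + num_classes) -> rows of (x1, y1, x2, y2, score, class_id)."""
    pred = pred[0]
    pred = pred[pred[:, 4] > conf_thres]           # filter by objectness
    if pred.shape[0] == 0:
        return np.zeros((0, 6), dtype=np.float32)
    class_scores = pred[:, 5:] * pred[:, 4:5]      # objectness * class score
    class_ids = class_scores.argmax(axis=1)
    scores = class_scores.max(axis=1)
    mask = scores > conf_thres
    boxes = xywh_to_xyxy(pred[mask, :4])
    scores, class_ids = scores[mask], class_ids[mask]
    keep = np.array(nms(boxes, scores, iou_thres), dtype=int)
    return np.column_stack([boxes[keep], scores[keep], class_ids[keep]])
```

With the question's variables this would be called as postprocess(onnx_output[0]) or postprocess(result['output']). Note also that the YOLOv5 PyTorch pipeline feeds the network RGB images scaled to [0, 1], while the snippets in the question pass raw BGR pixel values in the 0-255 range, so the input would likely need cv2.cvtColor(..., cv2.COLOR_BGR2RGB) and a division by 255.0 before the results can match. The returned boxes are in the coordinates of the resized 640x640 image and still have to be scaled back to the original image size.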

Upvotes: 2
