How do you run a half float ONNX model using ONNXRuntime C API?

Question

Since C has no built-in half-precision (float16) type, how do you pass input data to the ONNXRuntime C API for a model that expects float16 input?

Scott McKay · Accepted Answer

There may be an example you can follow linked from this comment: https://github.com/microsoft/onnxruntime/issues/1173#issuecomment-501088662

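You can create the input tensor with CreateTensorAsOrtValue (using the element type ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT16), which allocates a buffer for the input data, and then get a pointer to that buffer inside the OrtValue with GetTensorMutableData. Each element in the buffer is a raw 16-bit value.

ONNXRuntime itself uses Eigen to convert a float into the 16-bit value, and you could do the same to produce the values you write to that buffer: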

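#include <cstdint>
#include <Eigen/Core>  // pulls in Eigen's half-precision support
// Round to nearest even; .x holds the raw 16-bit float16 pattern.
uint16_t floatToHalf(float f) {
  return Eigen::half_impl::float_to_half_rtne(f).x;
}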
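
If you are building as plain C and cannot pull in Eigen (which is C++), the same conversion can be written by hand. The following is only a minimal sketch, not ONNXRuntime's own code: it assumes 32-bit IEEE 754 floats and the default round-to-nearest-even floating-point mode, and the names float_to_half_bits and fill_half_buffer are hypothetical helpers made up for this illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Convert one float32 to IEEE 754 binary16 bits, rounding to nearest even.
   Assumption: float is 32-bit IEEE 754 and the FP rounding mode is the
   default round-to-nearest-even. */
uint16_t float_to_half_bits(float value) {
    const uint32_t f32_inf      = 0x7F800000u; /* +inf as float32 bits */
    const uint32_t f16_overflow = 0x47800000u; /* 65536.0f: inf or NaN from here up */
    const uint32_t min_normal   = 0x38800000u; /* 2^-14, smallest normal half */
    const uint32_t denorm_magic = 0x3F000000u; /* 0.5f as float32 bits */

    uint32_t bits;
    memcpy(&bits, &value, sizeof bits);        /* type-pun without aliasing issues */
    uint32_t sign = bits & 0x80000000u;
    bits ^= sign;                              /* work on the magnitude only */
    uint16_t half;

    if (bits >= f16_overflow) {
        /* NaN becomes a quiet NaN, everything else becomes infinity. */
        half = (bits > f32_inf) ? 0x7E00u : 0x7C00u;
    } else if (bits < min_normal) {
        /* Result is a subnormal half or zero: adding 0.5f lets the FPU
           round the value into the low 10 mantissa bits. */
        float magnitude, magic;
        memcpy(&magnitude, &bits, sizeof magnitude);
        memcpy(&magic, &denorm_magic, sizeof magic);
        magnitude += magic;
        memcpy(&bits, &magnitude, sizeof bits);
        half = (uint16_t)(bits - denorm_magic);
    } else {
        /* Normal half: rebias the exponent by (15 - 127) << 23 and add the
           rounding bias 0xFFF, plus 1 more if the kept mantissa is odd. */
        uint32_t mant_odd = (bits >> 13) & 1u;
        bits += 0xC8000FFFu;                   /* wraps modulo 2^32 on purpose */
        bits += mant_odd;
        half = (uint16_t)(bits >> 13);
    }
    return (uint16_t)(half | (sign >> 16));
}

/* Hypothetical helper: fill a float16 buffer from float32 source data. */
void fill_half_buffer(uint16_t *dst, const float *src, size_t count) {
    for (size_t i = 0; i < count; ++i) {
        dst[i] = float_to_half_bits(src[i]);
    }
}
```

The idea would be to cast the pointer you get back from GetTensorMutableData to uint16_t * and pass it as dst, with count equal to the number of elements in the tensor.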

Alternatively, you could edit the model to add a Cast node from float32 to float16 at the input, so that the model takes float32 as input and does the conversion itself.
