katrasnikj

Reputation: 3291

How do you run a half float ONNX model using ONNXRuntime C API?

Since the C language doesn't have a half float implementation, how do you send data to the ONNXRuntime C API?

Upvotes: 2

Views: 5940

Answers (2)

Scott McKay

Reputation: 353

There's possibly an example you can follow linked from here: https://github.com/microsoft/onnxruntime/issues/1173#issuecomment-501088662

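You can create the input tensor with CreateTensorAsOrtValue, passing ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT16 as the element type so that ONNX Runtime allocates the buffer, and then get a pointer to that buffer inside the OrtValue with GetTensorMutableData. Each element in the buffer is a raw 16-bit value.

ONNX Runtime itself uses Eigen to convert a float into the 16-bit value that you write to that buffer: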

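#include <cstdint>
#include <Eigen/Core>
// Round-to-nearest-even conversion; .x holds the raw 16 bits of the half.
uint16_t floatToHalf(float f) {
  return Eigen::half_impl::float_to_half_rtne(f).x;
}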
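If Eigen (a C++ library) is not an option from plain C, the same round-to-nearest-even conversion can be sketched with integer operations. This is only a minimal sketch, not ONNX Runtime's own code: the names float_to_half_bits and floats_to_half_bits are made up for illustration, and dst is assumed to be the pointer you got back from GetTensorMutableData for a FLOAT16 tensor with at least n elements.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Convert one float to IEEE 754 binary16 bits, rounding to nearest even.
   Hypothetical helper, written for illustration only. */
uint16_t float_to_half_bits(float f) {
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);          /* read the float's bit pattern */
    uint32_t sign = bits & 0x80000000u;
    bits ^= sign;                            /* work on the absolute value */
    uint16_t half;

    if (bits >= (143u << 23)) {
        /* |f| >= 65536, Inf or NaN: map NaN to a quiet NaN, the rest to Inf */
        half = (bits > 0x7F800000u) ? 0x7E00u : 0x7C00u;
    } else if (bits < (113u << 23)) {
        /* Result is zero or subnormal in half precision. Adding 0.5f lets
           the float addition do the rounding and aligns the 10 mantissa
           bits at the bottom of the word. */
        float tmp;
        memcpy(&tmp, &bits, sizeof tmp);
        tmp += 0.5f;
        memcpy(&bits, &tmp, sizeof bits);
        half = (uint16_t)(bits - 0x3F000000u); /* subtract the bits of 0.5f */
    } else {
        /* Normal range: add the rounding bias, rebias the exponent from
           127 to 15, and keep the top 10 mantissa bits. */
        uint32_t mant_odd = (bits >> 13) & 1u;
        bits += 0xFFFu + mant_odd;
        bits -= (112u << 23);
        half = (uint16_t)(bits >> 13);
    }
    return (uint16_t)(half | (sign >> 16));
}

/* Fill a buffer of half values from floats. dst is assumed to point at
   storage for at least n 16-bit elements. */
void floats_to_half_bits(const float *src, uint16_t *dst, size_t n) {
    for (size_t i = 0; i < n; ++i) {
        dst[i] = float_to_half_bits(src[i]);
    }
}
```

The memcpy calls are there to read and write the float's bits without breaking C's aliasing rules; compilers generally turn them into plain register moves.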

Alternatively, you could edit the model to add a Cast node from float32 to float16 at the input, so that the model takes float32 as input and you never have to handle half floats in C.

Upvotes: 2

KamilCuk

Reputation: 141493

"the C language doesn't have a half float implementation"

True, standard C has no half float type, but there are language extensions, and you can also write your own library to handle the data.

For example, the _Float16 type defined by ISO/IEC TS 18661-3:2015 is supported by GCC on some architectures.
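As a rough sketch of how _Float16 could be used to get the 16 bits to send, the function below casts a float and copies out the raw bits. It assumes the compiler defines the macro __FLT16_MANT_DIG__ when _Float16 is available (GCC does this on targets where the type is supported); the function name float_to_half_native is made up, and on compilers without the type it just reports failure.

```c
#include <stdint.h>
#include <string.h>

/* Returns 1 and stores the binary16 bit pattern of f in *out when the
   compiler provides _Float16; returns 0 and leaves *out untouched
   otherwise. Hypothetical helper, for illustration only. */
int float_to_half_native(float f, uint16_t *out) {
#if defined(__FLT16_MANT_DIG__)
    _Float16 h = (_Float16)f;   /* the compiler performs the rounding */
    memcpy(out, &h, sizeof h);  /* copy the two raw bytes */
    return 1;
#else
    (void)f;
    (void)out;
    return 0;
#endif
}
```

Because the availability depends on the compiler and the target, code that must build everywhere still needs a fallback conversion.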

And you can write or find a library that will handle the half-floating point operations.
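Handling the data yourself also covers the other direction, since a float16 model will usually hand float16 outputs back. Below is a minimal sketch of decoding one half value into a float; half_bits_to_float is a made-up name, and the input is assumed to be a raw 16-bit element read from the output tensor's buffer.

```c
#include <stdint.h>
#include <string.h>

/* Decode IEEE 754 binary16 bits into a float. Every half value is exactly
   representable as a float, so no rounding is needed.
   Hypothetical helper, for illustration only. */
float half_bits_to_float(uint16_t h) {
    uint32_t sign = (uint32_t)(h & 0x8000u) << 16;
    uint32_t exp  = (h >> 10) & 0x1Fu;
    uint32_t mant = h & 0x3FFu;
    uint32_t bits;

    if (exp == 0x1Fu) {
        /* Inf or NaN: all exponent bits set, keep the mantissa payload */
        bits = sign | 0x7F800000u | (mant << 13);
    } else if (exp != 0) {
        /* Normal number: rebias the exponent from 15 to 127 */
        bits = sign | ((exp + 112u) << 23) | (mant << 13);
    } else if (mant == 0) {
        /* Signed zero */
        bits = sign;
    } else {
        /* Subnormal half: shift until the implicit leading 1 appears,
           lowering the exponent by one for each shift */
        uint32_t shift = 0;
        while ((mant & 0x400u) == 0) {
            mant <<= 1;
            ++shift;
        }
        mant &= 0x3FFu;
        bits = sign | ((113u - shift) << 23) | (mant << 13);
    }

    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}
```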

Upvotes: 1
