vvg

Reputation: 1213

Consuming an ML.NET trained model from the C++ OnnxRuntime

I wrote a C# program to train a multi-class classification model using Microsoft ML.NET. The training completes successfully, and I have exported the model as an ONNX file using the Microsoft.ML.OnnxConverter package.

I would like to consume the ONNX model from a C++ program (running on x64-windows) to run inference (the prediction task).

The shape of the input and output in my model is:

Input:
  Features:              float 1x7
  code_point:            float 1x1
Output:
  Features.output:       float 1x7
  code_point.output:     float 1x1
  PredictedLabel.output: float 1x1
  Score.output:          float 1x94

Note: code_point is of uint32_t datatype, as noted in the answer below. I am leaving the question as is, with this note included.

In the code for invoking the inference,

    constexpr size_t input_tensor_size = 8;
    std::vector<float> input_tensor_values(input_tensor_size);

    // initialize the input_tensor_values
    ...

    // create input tensor object from data values
    auto memory_info = Ort::MemoryInfo::CreateCpu(OrtArenaAllocator, OrtMemTypeDefault);
    auto input_tensor = Ort::Value::CreateTensor<float>(
        memory_info, 
        input_tensor_values.data(), input_tensor_size,
        input_node_dims.data(), input_node_dims.size());

    std::vector<const char*> output_node_names = { 
        "Features.output", "code_point.output", 
        "PredictedLabel.output", "Score.output" 
    };

    // score model & input tensor, get back output tensor
    auto output_tensors =
        session.Run(
          Ort::RunOptions{ nullptr }, 
          input_node_names.data(), 
          &input_tensor, 1, 
          output_node_names.data(), 1);

I am getting an access violation upon invoking session.Run() and cannot figure out the cause. I suspect it is either the input tensor being flattened into a 1x8 vector before being passed in, or the output count being passed as 1. I have tried setting the latter to 4, and that does not work either.

Could you please suggest the right sequence for initializing the tensors and calling the Run() function for the shape of the input/output given above?

Upvotes: 0

Views: 186

Answers (1)

vvg

Reputation: 1213

I found the mistake after @Botje pointed out the issue in a comment under the question.

First of all, there is a small error in the model: code_point is of uint32_t datatype, not float. The correct model is

Input:
   Features               float    1x7,
   code_point             uint32_t 1x1
Output:
   Features.output        float    1x7,
   code_point.output      uint32_t 1x1,
   PredictedLabel.output  uint32_t 1x1,
   Score.output           float    1x94

Secondly, as @Botje pointed out, the model has two inputs, viz. Features and code_point.

I created simple classes to hold the model input and output and pass it around:

struct model_input
{
public:
    std::vector<float>  features;
    uint32_t            code_point;
public:
    model_input()
    {
        features.resize(7, 0.0f);
        code_point = 0u;
    }
};

struct model_output
{
public:
    std::vector<float>  features;
    uint32_t            code_point;
    uint32_t            PredictedLabel;
    std::vector<float>  Score;
public:
    model_output()
    {
        features.resize(7, 0.0f);
        code_point = 0u;
        PredictedLabel = 0u;
        Score.resize(94, 0.0f);
    }
};

The working sequence for the initialization and inference is as follows:

    // copy the test input values into "Features"
    model_input mdl_input;
    mdl_input.features = {
        0.204244,
        0.0475028,
        -0.00872255,
        -0.0037717,
        -0.0122744,
        0.0262117,
        -0.000971803
    };
    mdl_input.code_point = 44u;
    model_output mdl_output;

    Ort::MemoryInfo memoryInfo = Ort::MemoryInfo::CreateCpu(OrtAllocatorType::OrtArenaAllocator, OrtMemType::OrtMemTypeDefault);

    std::vector<Ort::Value> inputTensors;

    //Features is float:7
    inputTensors.push_back(Ort::Value::CreateTensor<float>(
        memoryInfo,
        mdl_input.features.data(), mdl_input.features.size(),
        inputDims[0].data(), inputDims[0].size()));

    //code_point is uint32_t:1
    inputTensors.push_back(Ort::Value::CreateTensor<uint32_t>(
        memoryInfo,
        &mdl_input.code_point, 1,
        inputDims[1].data(), inputDims[1].size()));

    std::vector<Ort::Value> outputTensors;

    // Features.output is float:7
    outputTensors.push_back(Ort::Value::CreateTensor<float>(
        memoryInfo,
        mdl_output.features.data(), mdl_output.features.size(),
        outputDims[0].data(), outputDims[0].size()));

    // code_point.output is uint32_t:1
    outputTensors.push_back(Ort::Value::CreateTensor<uint32_t>(
        memoryInfo,
        &mdl_output.code_point, 1,
        outputDims[1].data(), outputDims[1].size()));

    // PredictedLabel.output is uint32_t:1
    outputTensors.push_back(Ort::Value::CreateTensor<uint32_t>(
        memoryInfo,
        &mdl_output.PredictedLabel, 1,
        outputDims[2].data(), outputDims[2].size()));

    // Score.output is float:94
    outputTensors.push_back(Ort::Value::CreateTensor<float>(
        memoryInfo,
        mdl_output.Score.data(), mdl_output.Score.size(),
        outputDims[3].data(), outputDims[3].size()));

    // names are hard-coded!
    std::vector<const char*> input_names_ptrs =
    {
        "Features",
        "code_point"
    };

    std::vector<const char*> output_names_ptrs =
    {
        "Features.output",
        "code_point.output",
        "PredictedLabel.output",
        "Score.output"
    };

    session.Run(
        Ort::RunOptions{ nullptr }, 
        input_names_ptrs.data(),
        inputTensors.data(),
        inputTensors.size(),  //Number of inputs 
        output_names_ptrs.data(),
        outputTensors.data(),
        outputTensors.size()   //Number of outputs
    );

    std::cout << "expected: " << mdl_input.code_point << ", predicted: " << mdl_output.code_point << std::endl;

After fixing this, the program generated the output:

expected: 44, predicted: 44

Upvotes: 1
