Christian

Reputation: 1257

Make ONNX inference faster in C#

I am using ONNX to export a model trained in Python and run it in C#.

Everything works rather well, but I'd like to speed up inference on the C# side using Tasks or Threads on the CPU. I don't have a GPU available.

When running the code on a single image, the overall time on my PC is about 200 ms; when running on 30 images, I get about 900-1000 ms. Here's the interesting code (I made a self-contained repo to reproduce the issue; it is on my GitHub).

    private static void RunInferenceOnImages(List<Tensor<float>> inputs)
    {
        var tasks = new List<Task<int>>();

        // Start one task per image; each task runs a complete inference.
        foreach (var image in inputs)
        {
            var input = new List<NamedOnnxValue>() {
                NamedOnnxValue.CreateFromTensor<float>("float_input", image),
            };
            Task<int> task = Task.Run(() => RunInference(modelPath, input));
            tasks.Add(task);
        }

        // Block until every task has finished, then collect the predicted labels.
        Task.WaitAll(tasks.ToArray());

        foreach (var task in tasks)
        {
            var output = task.Result;
        }
    }

    static int RunInference(string modelPath, List<NamedOnnxValue> inputs)
    {
        // Creates a new InferenceSession (and loads the model from disk) on every call.
        using var session = new InferenceSession(modelPath);
        using var results = session.Run(inputs);
        var scores = results.First().AsTensor<long>();
        return (int)scores[0];
    }

How would I go about making this faster?
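
For reference, this is roughly how I build the inputs and time the calls. It is a simplified sketch rather than the exact code from my repo; the feature count, the random data, and the Main method here are placeholders.

    // Simplified driver for the timings quoted above: builds dummy inputs,
    // runs them through RunInferenceOnImages and prints the elapsed time.
    // Assumes: using System.Diagnostics; using Microsoft.ML.OnnxRuntime.Tensors;
    static void Main()
    {
        const int imageCount = 30;       // set to 1 for the single-image timing
        const int featureCount = 784;    // placeholder; the real model defines the input size
        var rng = new Random(42);
        var inputs = new List<Tensor<float>>();

        for (int i = 0; i < imageCount; i++)
        {
            // Fill a [1, featureCount] tensor with random values as a stand-in for an image.
            var tensor = new DenseTensor<float>(new[] { 1, featureCount });
            for (int j = 0; j < tensor.Length; j++)
                tensor.SetValue(j, (float)rng.NextDouble());
            inputs.Add(tensor);
        }

        var sw = Stopwatch.StartNew();
        RunInferenceOnImages(inputs);
        sw.Stop();
        Console.WriteLine($"{imageCount} images took {sw.ElapsedMilliseconds} ms");
    }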

Upvotes: 0

Views: 87

Answers (0)
