I am using ONNX to export a model trained in Python and run it in C#. Everything works rather well, but I'd like to speed up the inference on the C# side using Tasks or Threads, on a CPU. I don't have a GPU available.
When running the code on a single image, the overall time on my PC is about 200 ms; when running it on 30 images, I get about 900-1000 ms. Here's the interesting code (I made a self-contained repo to reproduce the issue; it is on my GitHub).
private static void RunInferenceOnImages(List<Tensor<float>> inputs)
{
    var tasks = new List<Task<int>>();
    foreach (var image in inputs)
    {
        // Wrap each image tensor in the named-input format the session expects.
        var input = new List<NamedOnnxValue>
        {
            NamedOnnxValue.CreateFromTensor<float>("float_input", image),
        };
        // Run each inference on the thread pool.
        Task<int> task = Task.Run(() => RunInference(modelPath, input));
        tasks.Add(task);
    }
    Task.WaitAll(tasks.ToArray());
    foreach (var task in tasks)
    {
        var output = task.Result;
    }
}

static int RunInference(string modelPath, List<NamedOnnxValue> inputs)
{
    using var session = new InferenceSession(modelPath);
    using var results = session.Run(inputs);
    // The model outputs class labels as a tensor of longs; take the first one.
    var scores = results.First().AsTensor<long>();
    return (int)scores[0];
}
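
For context, here is roughly how I call and time this. It's a minimal sketch, not the exact harness from the repo: the tensor shape and contents are placeholders (my real inputs are built from preprocessed images), but the timing pattern is the same.

using System;
using System.Collections.Generic;
using System.Diagnostics;
using Microsoft.ML.OnnxRuntime.Tensors;

// Build 30 dummy input tensors; shape {1, 4} is a placeholder,
// the real tensors come from preprocessed images.
var inputs = new List<Tensor<float>>();
for (int i = 0; i < 30; i++)
{
    inputs.Add(new DenseTensor<float>(new float[4], new[] { 1, 4 }));
}

// Time the parallel inference over the whole batch.
var sw = Stopwatch.StartNew();
RunInferenceOnImages(inputs);
sw.Stop();
Console.WriteLine($"{inputs.Count} images took {sw.ElapsedMilliseconds} ms");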
How would I go about making this faster?