I am using ONNX to export a model trained in Python and run it in C#. Everything works rather well, but I'd like to speed up the inference on the C# side using Tasks or Threads, on a CPU. I don't have a GPU available.
When running the code on a single image, the overall time on my PC is about 200 ms; when running it on 30 images, I get about 900-1000 ms. Here's the interesting code (I made a self-contained repo to reproduce the issue; it is on my GitHub).
private static void RunInferenceOnImages(List<Tensor<float>> inputs)
{
    var tasks = new List<Task<int>>();
    foreach (var image in inputs)
    {
        // Wrap each image tensor in the named-input format the session expects.
        var input = new List<NamedOnnxValue>
        {
            NamedOnnxValue.CreateFromTensor<float>("float_input", image),
        };
        // Run each inference on the thread pool.
        Task<int> task = Task.Run(() => RunInference(modelPath, input));
        tasks.Add(task);
    }
    Task.WaitAll(tasks.ToArray());
    foreach (var task in tasks)
    {
        var output = task.Result;
    }
}

static int RunInference(string modelPath, List<NamedOnnxValue> inputs)
{
    using var session = new InferenceSession(modelPath);
    using var results = session.Run(inputs);
    // The model outputs class labels as a tensor of longs; take the first one.
    var scores = results.First().AsTensor<long>();
    return (int)scores[0];
}
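
For context, here is roughly how I call and time this. It's a minimal sketch, not the exact harness from the repo: the tensor shape and contents are placeholders (my real inputs are built from preprocessed images), but the timing pattern is the same.

using System;
using System.Collections.Generic;
using System.Diagnostics;
using Microsoft.ML.OnnxRuntime.Tensors;

// Build 30 dummy input tensors; shape {1, 4} is a placeholder,
// the real tensors come from preprocessed images.
var inputs = new List<Tensor<float>>();
for (int i = 0; i < 30; i++)
{
    inputs.Add(new DenseTensor<float>(new float[4], new[] { 1, 4 }));
}

// Time the parallel inference over the whole batch.
var sw = Stopwatch.StartNew();
RunInferenceOnImages(inputs);
sw.Stop();
Console.WriteLine($"{inputs.Count} images took {sw.ElapsedMilliseconds} ms");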
How would I go about making this faster?