Reputation: 336
I trained and quantized a TensorFlow model on an Ubuntu 18.04 machine and converted it to the tflite format. I then deployed it on a Linux Yocto board equipped with an NPU accelerator, tflite_runtime and NNAPI. I noticed that the same tflite model produces different predictions when inference runs on my PC's CPU versus the NPU+NNAPI on the board. The predictions are often similar, but in some cases they are completely different. When I disable NNAPI on the board and run inference on its CPU instead, the results match the PC CPU exactly, so I suspect NNAPI is the cause. However, I don't know why this happens. Is there a way to prevent it, or to make the network more robust to it during training?
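Before digging into NNAPI itself, it may help to quantify how far the NPU outputs actually drift from the CPU reference, since "similar" vs "completely different" is hard to act on. Below is a minimal numpy sketch (the example arrays and the `atol` threshold are hypothetical, not taken from your model) that reports the maximum element-wise difference and the top-1 agreement between two sets of outputs:

```python
import numpy as np

def compare_predictions(cpu_out, accel_out):
    """Compare two batches of model outputs (e.g. CPU vs NNAPI/NPU).

    Returns the maximum absolute element-wise difference and the
    fraction of samples whose argmax (top-1 prediction) agrees.
    """
    cpu_out = np.asarray(cpu_out, dtype=np.float32)
    accel_out = np.asarray(accel_out, dtype=np.float32)
    max_abs_diff = float(np.max(np.abs(cpu_out - accel_out)))
    top1_match = float(np.mean(
        np.argmax(cpu_out, axis=-1) == np.argmax(accel_out, axis=-1)))
    return max_abs_diff, top1_match

# Hypothetical example: two samples, three classes.
# The second sample's predicted class flips between backends.
cpu = [[0.1, 0.7, 0.2], [0.3, 0.4, 0.3]]
npu = [[0.1, 0.68, 0.22], [0.5, 0.2, 0.3]]
diff, agree = compare_predictions(cpu, npu)
print(diff, agree)  # max diff ~0.2, top-1 agreement 0.5
```

If the maximum difference stays small and top-1 agreement is near 1.0, you are likely seeing normal precision differences from the accelerator's quantized arithmetic; frequent class flips on a large sample, as measured above, point more toward a delegate bug or an operator being executed with different quantization parameters on the NPU path.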
Upvotes: 0
Views: 544
Reputation: 41
firion,
The NNAPI team is also interested in learning more.
Small variations are to be expected, but completely different results should not occur.
Have you tried on different devices? Do you see the same variations?
You mentioned a Linux Yocto build. Are you running your test using Android on Yocto, or using a Linux build of NNAPI?
Upvotes: 0