megashigger

Reputation: 9053

Speeding up inference of Keras models

I have a Keras model doing inference on a Raspberry Pi (with a camera). The Raspberry Pi has a really slow CPU (1.2 GHz) and no CUDA GPU, so the model.predict() stage is taking a long time (~20 seconds). I'm looking for ways to reduce that by as much as possible. I've tried:

Is there anything else I can do to speed up inference? Is there a way to simplify a model.h5, accepting a drop in accuracy? I've had success with simpler models, but for this project I need to rely on an existing model, so I can't train from scratch.

Upvotes: 7

Views: 5467

Answers (2)

dragon7

Reputation: 1133

Maybe OpenVINO will help. OpenVINO is an open-source toolkit for network inference that optimizes inference performance by, e.g., pruning the graph and fusing operations. ARM support is provided by the contrib repository.

Here are the instructions on how to build an ARM plugin to run OpenVINO on Raspberry Pi.
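For a rough idea of the workflow once the plugin is built, here is a minimal sketch using OpenVINO's Python API (openvino.convert_model and friends). The model file name and input shape are placeholders, and on Raspberry Pi the ARM plugin registers itself as the "CPU" device:

```python
import numpy as np
import openvino as ov
from tensorflow import keras

# Load the existing Keras model and convert it to OpenVINO's representation.
keras_model = keras.models.load_model("model.h5")
ov_model = ov.convert_model(keras_model)

# Compile for the local device; the ARM plugin shows up as "CPU".
core = ov.Core()
compiled = core.compile_model(ov_model, "CPU")

# Run inference on a dummy camera frame (placeholder shape).
frame = np.random.rand(1, 224, 224, 3).astype(np.float32)
result = compiled([frame])[compiled.output(0)]
print(result.shape)
```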

Disclaimer: I work on OpenVINO.

Upvotes: 0

Fábio Perez

Reputation: 26098

The VGG16/VGG19 architectures are very slow since they have a huge number of parameters (~138M for VGG16). Check this answer.

Before any other optimization, try to use a simpler network architecture.

Google's MobileNet seems like a good candidate since it's implemented in Keras and it was designed for resource-limited devices.
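As a sketch of what the swap could look like (assuming ImageNet weights and the standard 224x224 input; the width multiplier alpha can be lowered to trade accuracy for speed):

```python
import numpy as np
from tensorflow.keras.applications import MobileNet
from tensorflow.keras.applications.mobilenet import (
    preprocess_input,
    decode_predictions,
)

# alpha < 1.0 shrinks the network; 0.5 roughly quarters the multiply-adds.
model = MobileNet(weights="imagenet", alpha=0.5)

# Dummy 224x224 RGB frame in place of a real camera capture.
frame = np.random.uniform(0, 255, (1, 224, 224, 3)).astype(np.float32)
preds = model.predict(preprocess_input(frame))
print(decode_predictions(preds, top=3))
```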

If you can't use a different network, you may compress the network with pruning. This blog post does pruning with Keras specifically.
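The blog post's exact recipe isn't reproduced here, but for a sense of what pruning looks like, here is a sketch with the TensorFlow Model Optimization toolkit (tensorflow_model_optimization); the model path is a placeholder, and note that pruning requires some fine-tuning on your data, though not training from scratch:

```python
import tensorflow_model_optimization as tfmot
from tensorflow import keras

model = keras.models.load_model("model.h5")

# Gradually zero out 50% of the weights while fine-tuning.
pruned = tfmot.sparsity.keras.prune_low_magnitude(
    model,
    pruning_schedule=tfmot.sparsity.keras.ConstantSparsity(0.5, begin_step=0),
)
pruned.compile(optimizer="adam", loss="categorical_crossentropy",
               metrics=["accuracy"])

# Fine-tune on your own data; the callback keeps the pruning masks updated.
# pruned.fit(x_train, y_train, epochs=2,
#            callbacks=[tfmot.sparsity.keras.UpdatePruningStep()])

# Remove the pruning wrappers before saving the slimmed-down model.
final = tfmot.sparsity.keras.strip_pruning(pruned)
final.save("model_pruned.h5")
```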

Upvotes: 2
