Reputation: 4052
Until recently I had been training a model using TF 1.15 based on MobileNetV2.
After training I had always been able to run these commands to generate a TFLite version:
tf.keras.backend.set_learning_phase(0)
converter = tf.lite.TFLiteConverter.from_keras_model_file(tf_keras_path)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.target_spec.supported_types = [tf.lite.constants.FLOAT16]
tflite_model = converter.convert()
The resulting model was fast enough for our needs, and when our Android developer enabled XNNPack we got an extra 30% reduction in inference time.
More recently I've developed a replacement model using TF 2.4.1, based on the built-in Keras implementation of EfficientNet-B2.
This new model has a larger input image size ((260, 260) vs (224, 224)), and its Keras inference time is about 1.5x that of the older model.
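For reference, the new model is built roughly like this (a minimal sketch; the head layers, class count and weights argument are placeholders, not my actual code):

import tensorflow as tf

# Sketch of the replacement model (TF 2.4.1); the Dense head is hypothetical.
base = tf.keras.applications.EfficientNetB2(
    include_top=False,
    weights="imagenet",
    input_shape=(260, 260, 3),
    pooling="avg",
)
outputs = tf.keras.layers.Dense(2, activation="softmax")(base.output)
newest_v3 = tf.keras.Model(inputs=base.input, outputs=outputs)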
However, when I convert to TFLite using these commands:
converter = tf.lite.TFLiteConverter.from_keras_model(newest_v3)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.target_spec.supported_types = [tf.float16]
tflite_model = converter.convert()
there are a number of problems, the main one being that inference with the resulting TFLite model is about 5x slower than with the old model on device, rather than the ~1.5x slowdown I see in Keras.
I have also tried exporting to the SavedModel format and converting from that instead.
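In case it helps, that attempt looked roughly like this (a sketch; the directory name just matches the one used in the command below):

import tensorflow as tf

# Export the Keras model as a SavedModel, then convert from the directory.
newest_v3.save("newest_v3")  # TF 2.x saves in SavedModel format by default

converter = tf.lite.TFLiteConverter.from_saved_model("newest_v3")
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.target_spec.supported_types = [tf.float16]
tflite_model = converter.convert()

with open("newest_v3.tflite", "wb") as f:
    f.write(tflite_model)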
Another attempt I made was to use the command-line tool with these arguments (and, as far as I recall, pretty much every permutation of arguments possible):
tflite_convert --saved_model_dir newest_v3/ \
  --enable_v1_converter \
  --experimental_new_converter True \
  --input_shapes=1,260,260,3 \
  --input_arrays=input_1:0 \
  --post_training_quantize \
  --quantize_to_float16 \
  --output_file newest_v3d.tflite \
  --allow_custom_ops
If anyone can shed some light onto what's going on here I'd be very grateful.
Upvotes: 1
Views: 1371
Reputation: 1201
TensorFlow Lite does currently support tensors with dynamic shapes (enabled by default in the new converter, and explicitly by the "experimental_new_converter True" option in your conversion), but the issue below points out that XNNPack does not:
https://github.com/tensorflow/tensorflow/issues/42491
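You can check whether your converted model ended up with a dynamic dimension by inspecting the input details with the TFLite interpreter (a quick sketch; the file name "newest_v3.tflite" is just an assumption). A -1 in shape_signature indicates a dynamic dimension:

import tensorflow as tf

# Load the converted model and print its input signature.
interpreter = tf.lite.Interpreter(model_path="newest_v3.tflite")
for detail in interpreter.get_input_details():
    # 'shape_signature' keeps -1 for dynamic dimensions,
    # while 'shape' shows the concrete shape used for allocation.
    print(detail["name"], detail["shape"], detail["shape_signature"])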
Because XNNPack is not able to optimize the graph of the EfficientNet model, you are not getting that boost in performance, which is why inference ends up about 5 times slower than before instead of only around 1.5 times.
Personally, I would recommend moving to EfficientNet-lite, as it is the mobile/TPU counterpart of EfficientNet and was designed with the restricted set of operations available in TensorFlow Lite in mind:
https://blog.tensorflow.org/2020/03/higher-accuracy-on-vision-models-with-efficientnet-lite.html
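If you do switch, one minimal way to build on EfficientNet-lite is via TensorFlow Hub and then convert as before (a sketch; the hub handle, input size and Dense head are assumptions you would adapt to your task, so check tfhub.dev for the exact handle):

import tensorflow as tf
import tensorflow_hub as hub

# Hypothetical EfficientNet-lite2 feature extractor from TF Hub
# (lite2 nominally uses 260x260 inputs); verify the handle on tfhub.dev.
handle = "https://tfhub.dev/tensorflow/efficientnet/lite2/feature-vector/2"

model = tf.keras.Sequential([
    tf.keras.layers.InputLayer(input_shape=(260, 260, 3)),
    hub.KerasLayer(handle),
    tf.keras.layers.Dense(2, activation="softmax"),  # placeholder head
])

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.target_spec.supported_types = [tf.float16]
tflite_model = converter.convert()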
Upvotes: 2