James

Reputation: 4052

Converting tf.keras model to TFLite: Model is slow and doesn't work with XNN Pack

Until recently I had been training a model using TF 1.15 based on MobileNetV2.

After training I had always been able to run these commands to generate a TFLite version:

    tf.keras.backend.set_learning_phase(0)

    converter = tf.lite.TFLiteConverter.from_keras_model_file(
        tf_keras_path)

    converter.optimizations = [tf.lite.Optimize.DEFAULT]

    converter.target_spec.supported_types = [
        tf.lite.constants.FLOAT16]
    tflite_model = converter.convert()

The resulting model was fast enough for our needs, and when our Android developer enabled XNNPack we got an extra 30% reduction in inference time.

More recently I've developed a replacement model using TF 2.4.1, based on the built-in Keras implementation of EfficientNet-B2.

This new model has a larger input image size (260x260 vs 224x224), and its Keras inference time is about 1.5x that of the older model.

However, when I convert to TFLite using these commands:

    converter = tf.lite.TFLiteConverter.from_keras_model(newest_v3)
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    converter.target_spec.supported_types = [tf.float16]
    tflite_model = converter.convert()

there are a number of problems:

- Inference with the converted TFLite model is roughly 5x slower than the old model, rather than the ~1.5x slowdown I see in Keras.
- When our Android developer enables XNNPack, the model doesn't work with it, so we get no speedup there.

I have also tried saving as SavedModel and converting the saved model.
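
For reference, this is roughly what the SavedModel attempt looks like (a minimal sketch; newest_v3 is the Keras model and the export directory name is just illustrative):

    import tensorflow as tf

    # Export the Keras model in the SavedModel format, then convert the directory.
    tf.saved_model.save(newest_v3, "newest_v3/")

    converter = tf.lite.TFLiteConverter.from_saved_model("newest_v3/")
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    converter.target_spec.supported_types = [tf.float16]
    tflite_model = converter.convert()

    with open("newest_v3d.tflite", "wb") as f:
        f.write(tflite_model)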

Another attempt I made was to use the command line tool with these arguments (and as far as I recall, pretty much every permutation of arguments possible):

    tflite_convert --saved_model_dir newest_v3/ \
        --enable_v1_converter \
        --experimental_new_converter=true \
        --input_shapes=1,260,260,3 \
        --input_arrays=input_1 \
        --post_training_quantize \
        --quantize_to_float16 \
        --output_file newest_v3d.tflite \
        --allow_custom_ops

If anyone can shed some light onto what's going on here I'd be very grateful.

Upvotes: 1

Views: 1371

Answers (1)

Mr K.

Reputation: 1201

TensorFlow Lite does currently support tensors with dynamic shapes (enabled by default, and explicitly by the "experimental_new_converter" option in your conversion), but the issue below points out that XNNPack does not:

https://github.com/tensorflow/tensorflow/issues/42491
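
A common workaround is to pin the input to a fully static shape before conversion, so the resulting TFLite graph has no dynamic dimensions for XNNPack to deal with. A minimal sketch, assuming newest_v3 is the Keras model and a fixed batch size of 1:

    import tensorflow as tf

    # Wrap the Keras model in a tf.function and fix the input signature to a
    # fully static shape (batch 1, 260x260 RGB).
    run_model = tf.function(lambda x: newest_v3(x))
    concrete_func = run_model.get_concrete_function(
        tf.TensorSpec([1, 260, 260, 3], tf.float32))

    converter = tf.lite.TFLiteConverter.from_concrete_functions([concrete_func])
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    converter.target_spec.supported_types = [tf.float16]
    tflite_model = converter.convert()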

Because XNNPack is not able to optimize the graph of the EfficientNet model, you are not getting that boost in performance, which makes inference about 5 times slower than before instead of only around 1.5 times.

Personally, I would recommend moving to EfficientNet-Lite, as it is the mobile/TPU counterpart of EfficientNet and was designed with the restricted set of operations available in TensorFlow Lite in mind:

https://blog.tensorflow.org/2020/03/higher-accuracy-on-vision-models-with-efficientnet-lite.html
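
For example, a classifier on an EfficientNet-Lite backbone from TensorFlow Hub might look roughly like this (a sketch only; the hub handle, 224x224 input size and classification head are assumptions, not something from your setup):

    import tensorflow as tf
    import tensorflow_hub as hub

    num_classes = 10  # illustrative

    # EfficientNet-lite0 feature extractor from TF Hub (handle assumed here;
    # pick the lite variant that matches your accuracy/latency target).
    backbone = hub.KerasLayer(
        "https://tfhub.dev/tensorflow/efficientnet/lite0/feature-vector/2",
        trainable=True)

    model = tf.keras.Sequential([
        tf.keras.layers.InputLayer(input_shape=(224, 224, 3)),
        backbone,
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(num_classes, activation="softmax"),
    ])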

Upvotes: 2
