Kazamaa

Reputation: 99

Set batch size of trained keras model to 1

I have a Keras model trained on my own dataset. However, after loading the weights, the summary shows None as the first dimension (the batch size).

How can I fix the shape to a batch size of 1? I need a fixed batch size so that I can convert the model to TFLite with GPU support.

Upvotes: 2

Views: 2315

Answers (3)

K. Bogdan

Reputation: 535

The answer by @maciek97x helped me find an alternative for a similar problem, but it did not work for a pre-trained model that I was trying to train in another scenario. Changing only the input layer's shape was also not enough. I tested with TF 2.15.0.

Since you mention that you trained the model on your own dataset, I assume you have the model definition available. What worked for me was:

  1. Load the pre-trained model and your custom objects, if applicable.

    import re
    import tensorflow as tf
    
    # Model with old batch size
    pre_trained_model = tf.keras.models.load_model(path_to_model_folder, custom_objects={"SomeCustomLayer": SomeCustomLayer})
    
  2. Build another model from the same definition, setting the batch size on the input layer and replacing operators used outside the custom layers with their TF equivalents, for example '+' with tf.add(...) and '*' with tf.multiply(...).

    batch_size = 1
    data_input_layer = tf.keras.layers.Input(
        shape=[128, 128, 3],
        batch_size=batch_size,
    )
    # model definition ...
    
    new_model = tf.keras.Model(
        inputs=[data_input_layer],
        outputs=[output],
    )
    
  3. Load the pre-trained weights into the new model.

    # Map layer names to the layers of the pre-trained model
    layer_dict = {layer.name: layer for layer in pre_trained_model.layers}
    
    for layer in new_model.layers:
        # Skip layers that have no weights
        if layer.get_weights():
            # Strip suffixes such as '_1' or '_2', which can appear when you create two models with the same definition
            name_filtered = re.sub(r'_\d+', '', layer.name)
    
            pre_trained_weights = layer_dict[name_filtered].get_weights()
            new_model.get_layer(layer.name).set_weights(pre_trained_weights)
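
The name-matching step can be checked on its own, without TensorFlow. The sketch below uses made-up layer names (they are assumptions for illustration, not names from any particular model) to compare the `_\d+` pattern from the snippet above with an anchored variant, `_\d+$`, that only removes a trailing suffix:

```python
import re

# Hypothetical layer names, chosen only to illustrate the two patterns.
names = ["conv2d_1", "dense_12", "block_1_expand_3", "max_pooling2d"]


def strip_all(name):
    """Pattern from the answer: removes every '_<digits>' group in the name."""
    return re.sub(r"_\d+", "", name)


def strip_trailing(name):
    """Anchored variant: removes only a trailing '_<digits>' suffix."""
    return re.sub(r"_\d+$", "", name)


for name in names:
    print(f"{name:18} all={strip_all(name):14} trailing={strip_trailing(name)}")
```

With names like `block_1_expand_3`, the unanchored pattern also removes the `_1` in the middle, so the lookup in `layer_dict` would raise a `KeyError`. Neither variant helps if the pre-trained model's own layer names already end in a numeric suffix; in that case it may be simpler to pair the layers of the two models by position.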
    

Upvotes: 0

maciek97x

Reputation: 7370

I had the same problem and could not find a solution anywhere. However, I managed to solve it with a workaround.

First, get the config of your model:

conf = model.get_config()

In it you will see a batch_input_shape entry:

{'name': 'sequential',
 'layers': [{'class_name': 'InputLayer',
   'config': {'batch_input_shape': (None, 28, 28, 1),
    'dtype': 'float32',
    'sparse': False,
    ...

Next, change the None value to the desired batch size. For example, to set it to 1:

for layer in conf['layers']:
    if 'batch_input_shape' in layer['config']:
        shape = layer['config']['batch_input_shape']
        shape = (1, *shape[1:])
        layer['config']['batch_input_shape'] = shape

Now the config dict should look like this:

{'name': 'sequential',
 'layers': [{'class_name': 'InputLayer',
   'config': {'batch_input_shape': (1, 28, 28, 1),
    'dtype': 'float32',
    'sparse': False,
    ...

The final step is to create a model from the new config and copy the weights from the original model:

new_model = model.from_config(conf)
new_model.set_weights(model.get_weights())

Looking at the summary, we see it worked:

Model: "sequential"
_________________________________________________________________
 Layer (type)                  Output Shape            Param #   
=================================================================
 conv2d (Conv2D)               (1, 26, 26, 32)         320       
                                                                 
 max_pooling2d (MaxPooling2D)  (1, 13, 13, 32)         0         
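
The config edit above can also be wrapped in a small helper that leaves the original config untouched. This is only a sketch on a plain dict shaped like the config shown earlier: the name `with_fixed_batch_size` is hypothetical, the `batch_shape` key is an assumption for newer Keras versions that store the input shape under that name, and nested sub-models are not handled.

```python
import copy


def with_fixed_batch_size(conf, batch_size=1):
    """Return a copy of a model config dict with the batch dimension fixed."""
    new_conf = copy.deepcopy(conf)  # do not mutate the caller's config
    for layer in new_conf["layers"]:
        layer_conf = layer["config"]
        # 'batch_input_shape' is the key shown in the answer;
        # 'batch_shape' is assumed here for newer Keras versions.
        for key in ("batch_input_shape", "batch_shape"):
            if layer_conf.get(key) is not None:
                layer_conf[key] = (batch_size, *layer_conf[key][1:])
    return new_conf


# Minimal stand-in for model.get_config(), trimmed to the relevant fields.
conf = {
    "name": "sequential",
    "layers": [
        {"class_name": "InputLayer",
         "config": {"batch_input_shape": (None, 28, 28, 1), "dtype": "float32"}},
        {"class_name": "Conv2D", "config": {"filters": 32}},
    ],
}

fixed = with_fixed_batch_size(conf, batch_size=1)
print(fixed["layers"][0]["config"]["batch_input_shape"])  # (1, 28, 28, 1)
```

The returned dict can then be passed to `from_config` in place of the edited `conf`, followed by the same `set_weights` call.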
                                                          

Upvotes: 1

Joe Mattioni

Reputation: 99

What worked for me was to specify the batch size on the Input layer, like this:

input = layers.Input(shape=input_shape, batch_size=1, dtype='float32', name='images')

This then carried through the rest of the layers.

The bad news is that despite this "fix", the TFLite runtime still complains about dynamic tensors. I get these non-fatal errors in logcat when it runs:

E/tflite: third_party/tensorflow/lite/core/subgraph.cc:801 tensor.data.raw != nullptr was not true.
E/tflite: Attempting to use a delegate that only supports static-sized tensors with a graph that has dynamic-sized tensors (tensor#26 is a dynamic-sized tensor).
E/tflite: Ignoring failed application of the default TensorFlow Lite delegate indexed at 0.

The good news is that despite these errors it seems to be using the GPU anyway, based on performance testing.

I'm using:

tensorflow-lite-support:0.2.0
tensorflow-lite-metadata:0.2.1
tensorflow-lite:2.6.0
tensorflow:tensorflow-lite-gpu:2.3.0

Hopefully, they'll fix the runtime so it doesn't matter whether the batch size is 'None'. It shouldn't matter for doing inference.

Upvotes: 2
