TensorFlow Custom Layer: Get the actual Batch Size

Question

I would like to implement a custom TensorFlow layer that performs a mathematical operation involving the actual batch size of the input tensor:

import tensorflow as tf
from   tensorflow import keras

class MyLayer(keras.layers.Layer):

    def build(self, input_shape):
        self.batch_size = input_shape[0]
        super().build(input_shape)

    def call(self,input):
        self.batch_size + 1 # do something with the batch size
        return input

However, when the graph is built, input_shape[0] is None, which breaks the functionality in MyLayer:

input = keras.Input(shape=(10,))
x     = MyLayer()(input)

TypeError: in user code:

    :11 call  *
        self.batch_size + 1 # do something with the batch size

    TypeError: unsupported operand type(s) for +: 'NoneType' and 'int'

Is there any way to make such layers work after the model has been constructed?

o-90 · Accepted Answer

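Use tf.shape to grab the batch size inside your layer's call method. The static shape passed to build has None for the batch dimension because the batch size is not known when the graph is constructed, whereas tf.shape(x)[0] returns a scalar tensor that is evaluated at run time with the actual batch size.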

Example:

import tensorflow as tf


# custom layer
class MyLayer(tf.keras.layers.Layer):
    def __init__(self):
        super().__init__()
        
    def call(self, x):
        bs = tf.shape(x)[0]
        return x, tf.add(bs, 1)
    
    
# network
x_in = tf.keras.Input(shape=(10,))  # batch dimension is implicit; matches the [5, 10] input below
x = MyLayer()(x_in)

# model def
model = tf.keras.models.Model(x_in, x)

# forward pass
_, shp = model(tf.random.normal([5, 10]))

# shape value
print(shp)
# tf.Tensor(6, shape=(), dtype=int32)
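
To make the static/dynamic distinction concrete, here is a minimal sketch, assuming TensorFlow 2.x. The layer name DivideByBatchSize and the helper run are hypothetical and used only for illustration. The tf.function signature leaves the batch dimension unspecified, so the static shape is None during tracing while tf.shape still yields the real batch size on each call.

```python
import tensorflow as tf


class DivideByBatchSize(tf.keras.layers.Layer):
    """Hypothetical layer: divides the input by the runtime batch size."""

    def call(self, x):
        # tf.shape returns a tensor, so this is resolved when the graph runs,
        # not when it is built.
        batch_size = tf.shape(x)[0]
        return x / tf.cast(batch_size, x.dtype)


layer = DivideByBatchSize()


@tf.function(input_signature=[tf.TensorSpec([None, 10], tf.float32)])
def run(x):
    # This Python check executes once, during tracing: the static batch
    # dimension is unknown at that point.
    assert x.shape[0] is None
    return layer(x)


out4 = run(tf.ones([4, 10]))  # every element is 1 / 4
out8 = run(tf.ones([8, 10]))  # every element is 1 / 8
print(out4[0, 0].numpy(), out8[0, 0].numpy())
# 0.25 0.125
```

The same traced function handles both batch sizes, which would not be possible if the batch size were read from the static shape in build.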
