Lin Lan

Reputation: 71

Why does tf.function trace layers twice?

The example code is as follows:

import tensorflow as tf     # tf-2.4 or tf-2.x
from datetime import datetime

# Define a layer with an eager side effect
class EagerLayer(tf.keras.layers.Layer):
  def __init__(self, **kwargs):
    super(EagerLayer, self).__init__(**kwargs)
    # Do some kind of initialization here
    self.dense = tf.keras.layers.Dense(32)

  def call(self, inputs):
    print("\nCurrently running eagerly", str(datetime.now()))
    return self.dense(inputs)

input_data = tf.random.uniform([60, 28, 28])
layer = EagerLayer()
tf_func_layer = tf.function(layer)
print("=============")
_ = tf_func_layer(input_data)

The output is

=============

Currently running eagerly 2020-12-16 20:46:38.482592

Currently running eagerly 2020-12-16 20:46:38.503605

The printing side effect shows up twice, which means the function is traced twice.

I am just wondering why the layer is traced twice.

A Colab notebook can be found here.

Upvotes: 5

Views: 456

Answers (2)

DrPavel

Reputation: 21

I think this is happening because of tf.keras.layers.Dense.

The Keras implementation of a Dense layer postpones creation of the weights and biases until the first call, rather than doing it at construction time. This is done so that you don't have to specify the input dimension explicitly; it is instead inferred from the first input tensor.
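This deferred build is easy to observe directly. A minimal sketch, assuming tf-2.x with tf imported as in the question:

d = tf.keras.layers.Dense(32)
print(d.built, d.weights)          # False [] -- no variables created yet
_ = d(tf.random.uniform([1, 8]))   # input dimension (8) is inferred on this first call
print(d.built, d.kernel.shape)     # True (8, 32)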

The code snippet below traces only once, because all the weights and biases are created explicitly at initialisation time.

import tensorflow as tf

class Dense(tf.Module):
    """
        Fully connected dense layer.
    """

    def __init__(self, neurons_in: int, neurons_out: int, name: str = None) -> None:
        """
            Build the weights and bias variables at initialisation time.
            Calculate output tensor through matrix multiplication and add bias.

            args:
                neurons_in: int - Number of neurons in.
                neurons_out: int - Number of neurons for the fully connected layer.
                name: str - Name of the layer.
            returns:
                None
        """

        super(Dense, self).__init__(name = name)
        self._neurons_in = neurons_in
        self._neurons_out = neurons_out

        weights_shape = (neurons_in, neurons_out)
        self.W = tf.Variable(
            initial_value = tf.initializers.GlorotNormal()(weights_shape),
            trainable = True,
            name = "dense_weights",
            dtype = tf.float32
        )
        bias_shape = (neurons_out,)
        self.b = tf.Variable(
            initial_value = tf.initializers.GlorotNormal()(bias_shape),
            trainable = True,
            name = "dense_bias",
            dtype = tf.float32
        )

    @tf.function(input_signature=[tf.TensorSpec(shape=(None, 500), dtype=tf.float32)])
    def __call__(self, x_in: tf.Tensor) -> tf.Tensor:
        """
            args:
                x_in: tf.Tensor - Input tensor of dimension (None, input_size).
            returns:
                x_out: tf.Tensor - Output tensor of dimension (None, neurons).
        """

        x_out = tf.matmul(x_in, self.W) + self.b
        return x_out

Now consider a case where parameter initialisation is deferred to the first call of the layer, which is what Keras does.

class KerasDense(tf.Module):
    """
        Keras style fully connected dense layer.
    """

    def __init__(self, neurons: int, name: str = None) -> None:
        """
            Build the weights and bias variables on the first call.
            Calculate output tensor through matrix multiplication and add bias.

            args:
                neurons: int - Number of neurons for the fully connected layer.
                name: str - Name of the layer.
            returns:
                None
        """

        super(KerasDense, self).__init__(name = name)
        self._neurons = neurons

        # do not initialise weights at init
        self.W = None
        self.b = None

        # is built flag for dynamic input size inference
        self._is_built = False

    @tf.function(input_signature=[tf.TensorSpec(shape=(None, 500), dtype=tf.float32)])
    def __call__(self, x_in: tf.Tensor) -> tf.Tensor:
        """
            args:
                x_in: tf.Tensor - Input tensor of dimension (None, input_size).
            returns:
                x_out: tf.Tensor - Output tensor of dimension (None, neurons).
        """

        if not self._is_built:
            weights_shape = (x_in.shape[-1], self._neurons)

            self.W = tf.Variable(
                initial_value = tf.initializers.GlorotNormal()(weights_shape),
                trainable = True,
                name = "dense_weights",
                dtype = tf.float32
            )

            bias_shape = (self._neurons,)
            self.b = tf.Variable(
                initial_value = tf.initializers.GlorotNormal()(bias_shape),
                trainable = True,
                name = "dense_bias",
                dtype = tf.float32
            )

            self._is_built = True

        x_out = tf.matmul(x_in, self.W) + self.b
        return x_out

The Keras-style dense layer will trace twice.

Note that I have subclassed tf.Module instead of tf.keras.layers.Layer, but the latter is derived from the former anyway.
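To make the extra trace visible, you can add a Python-side print (for example print("tracing")) at the top of each __call__ above and run something like the following sketch (the variable names are just for illustration):

x = tf.random.uniform([4, 500])

eager_dense = Dense(neurons_in=500, neurons_out=32, name="eager_dense")
_ = eager_dense(x)   # "tracing" printed once: the variables already exist

keras_dense = KerasDense(neurons=32, name="keras_dense")
_ = keras_dense(x)   # "tracing" printed twice: once while the variables are being
                     # created, and once more after they exist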

Hope this helps!

Upvotes: 1

DJ001

Reputation: 75

Switching on verbose AutoGraph logging gives the output below.
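One way to enable that logging is AutoGraph's verbosity switch; the exact level needed to see the "Converted call" lines is an assumption on my part:

tf.autograph.set_verbosity(10, alsologtostdout=True)  # maximum AutoGraph verbosity, echoed to stdout
_ = tf_func_layer(input_data)                          # tf_func_layer / input_data as defined in the question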

=============
INFO:tensorflow:Converted call: <__main__.EagerLayer object at 0x7fbf6e34b860>
    args: (<tf.Tensor 'self:0' shape=(60, 28, 28) dtype=float32>,)
    kwargs: {}

INFO:tensorflow:Converted call: <built-in function hasattr>
    args: (<__main__.EagerLayer object at 0x7fbf6e34b860>, '_thread_local')
    kwargs: None

INFO:tensorflow:Converted call: <bound method Layer._split_out_first_arg of <__main__.EagerLayer object at 0x7fbf6e34b860>>
    args: ((<tf.Tensor 'self:0' shape=(60, 28, 28) dtype=float32>,), {})
    kwargs: None

INFO:tensorflow:Converted call: <function flatten at 0x7fbf86ceb598>
    args: (<tf.Tensor 'self:0' shape=(60, 28, 28) dtype=float32>,)
    kwargs: None

INFO:tensorflow:Converted call: <function _in_functional_construction_mode at 0x7fbf82850840>
    args: (<__main__.EagerLayer object at 0x7fbf6e34b860>, <tf.Tensor 'self:0' shape=(60, 28, 28) dtype=float32>, (), {}, [<tf.Tensor 'self:0' shape=(60, 28, 28) dtype=float32>])
    kwargs: None

INFO:tensorflow:Converted call: <function call_context at 0x7fbf8276bf28>
    args: ()
    kwargs: None

INFO:tensorflow:Converted call: <built-in function any>
    args: (<generator object outer_factory.<locals>.inner_factory.<locals>.tf____call__.<locals>.else_body_10.<locals>.<genexpr> at 0x7fbf6e07d4c0>,)
    kwargs: None

INFO:tensorflow:Converted call: <built-in function isinstance>
    args: (<tf.Tensor 'self:0' shape=(60, 28, 28) dtype=float32>, (<class 'tensorflow.python.ops.numpy_ops.np_arrays.ndarray'>, <class 'numpy.ndarray'>, <class 'float'>, <class 'int'>))
    kwargs: None

INFO:tensorflow:Converted call: <bound method Layer._get_input_masks of <__main__.EagerLayer object at 0x7fbf6e34b860>>
    args: (<tf.Tensor 'self:0' shape=(60, 28, 28) dtype=float32>, [<tf.Tensor 'self:0' shape=(60, 28, 28) dtype=float32>], (), {})
    kwargs: None

INFO:tensorflow:Converted call: <bound method Layer._set_training_mode of <__main__.EagerLayer object at 0x7fbf6e34b860>>
    args: ((), {}, <tensorflow.python.keras.engine.base_layer_utils.CallContext object at 0x7fbf79a7c860>)
    kwargs: None

INFO:tensorflow:Converted call: <bound method Layer._clear_losses of <__main__.EagerLayer object at 0x7fbf6e34b860>>
    args: ()
    kwargs: None

INFO:tensorflow:Converted call: <function executing_eagerly at 0x7fbf86cb58c8>
    args: ()
    kwargs: None

INFO:tensorflow:Converted call: <bound method Layer._maybe_cast_inputs of <__main__.EagerLayer object at 0x7fbf6e34b860>>
    args: (<tf.Tensor 'self:0' shape=(60, 28, 28) dtype=float32>, [<tf.Tensor 'self:0' shape=(60, 28, 28) dtype=float32>])
    kwargs: None

INFO:tensorflow:Converted call: <function assert_input_compatibility at 0x7fbf8276f2f0>
    args: (None, <tf.Tensor 'self:0' shape=(60, 28, 28) dtype=float32>, 'eager_layer_7')
    kwargs: None

INFO:tensorflow:Converted call: <bound method Layer._name_scope of <__main__.EagerLayer object at 0x7fbf6e34b860>>
    args: ()
    kwargs: None

INFO:tensorflow:Converted call: <bound method Layer._autographed_call of <__main__.EagerLayer object at 0x7fbf6e34b860>>
    args: ()
    kwargs: None

INFO:tensorflow:Converted call: <bound method Layer._maybe_build of <__main__.EagerLayer object at 0x7fbf6e34b860>>
    args: (<tf.Tensor 'self:0' shape=(60, 28, 28) dtype=float32>,)
    kwargs: None

INFO:tensorflow:Converted call: <function EagerLayer.call at 0x7fbf6ccc1bf8>
    args: (<tf.Tensor 'self:0' shape=(60, 28, 28) dtype=float32>,)
    kwargs: {}

INFO:tensorflow:Converted call: <bound method EagerLayer.call of <__main__.EagerLayer object at 0x7fbf6e34b860>>
    args: (<tf.Tensor 'self:0' shape=(60, 28, 28) dtype=float32>,)
    kwargs: {}

INFO:tensorflow:Converted call: <built-in method now of type object at 0xa33e60>
    args: ()
    kwargs: None

INFO:tensorflow:Converted call: <class 'str'>
    args: (datetime.datetime(2020, 12, 31, 15, 15, 24, 101021),)
    kwargs: None


Currently running eagerly 2020-12-31 15:15:24.101021
INFO:tensorflow:Converted call: <tensorflow.python.keras.layers.core.Dense object at 0x7fbf6cccb198>
    args: (<tf.Tensor 'self:0' shape=(60, 28, 28) dtype=float32>,)
    kwargs: None

INFO:tensorflow:Converted call: <bound method Layer._set_save_spec of <__main__.EagerLayer object at 0x7fbf6e34b860>>
    args: (<tf.Tensor 'self:0' shape=(60, 28, 28) dtype=float32>,)
    kwargs: None

INFO:tensorflow:Converted call: <__main__.EagerLayer object at 0x7fbf6e34b860>
    args: (<tf.Tensor 'self:0' shape=(60, 28, 28) dtype=float32>,)
    kwargs: {}

INFO:tensorflow:Converted call: <built-in function hasattr>
    args: (<__main__.EagerLayer object at 0x7fbf6e34b860>, '_thread_local')
    kwargs: None

INFO:tensorflow:Converted call: <bound method Layer._split_out_first_arg of <__main__.EagerLayer object at 0x7fbf6e34b860>>
    args: ((<tf.Tensor 'self:0' shape=(60, 28, 28) dtype=float32>,), {})
    kwargs: None

INFO:tensorflow:Converted call: <function flatten at 0x7fbf86ceb598>
    args: (<tf.Tensor 'self:0' shape=(60, 28, 28) dtype=float32>,)
    kwargs: None

INFO:tensorflow:Converted call: <function _in_functional_construction_mode at 0x7fbf82850840>
    args: (<__main__.EagerLayer object at 0x7fbf6e34b860>, <tf.Tensor 'self:0' shape=(60, 28, 28) dtype=float32>, (), {}, [<tf.Tensor 'self:0' shape=(60, 28, 28) dtype=float32>])
    kwargs: None

INFO:tensorflow:Converted call: <function call_context at 0x7fbf8276bf28>
    args: ()
    kwargs: None

INFO:tensorflow:Converted call: <built-in function any>
    args: (<generator object outer_factory.<locals>.inner_factory.<locals>.tf____call__.<locals>.else_body_10.<locals>.<genexpr> at 0x7fbf6e3615c8>,)
    kwargs: None

INFO:tensorflow:Converted call: <built-in function isinstance>
    args: (<tf.Tensor 'self:0' shape=(60, 28, 28) dtype=float32>, (<class 'tensorflow.python.ops.numpy_ops.np_arrays.ndarray'>, <class 'numpy.ndarray'>, <class 'float'>, <class 'int'>))
    kwargs: None

INFO:tensorflow:Converted call: <bound method Layer._get_input_masks of <__main__.EagerLayer object at 0x7fbf6e34b860>>
    args: (<tf.Tensor 'self:0' shape=(60, 28, 28) dtype=float32>, [<tf.Tensor 'self:0' shape=(60, 28, 28) dtype=float32>], (), {})
    kwargs: None

INFO:tensorflow:Converted call: <bound method Layer._set_training_mode of <__main__.EagerLayer object at 0x7fbf6e34b860>>
    args: ((), {}, <tensorflow.python.keras.engine.base_layer_utils.CallContext object at 0x7fbf79a7c860>)
    kwargs: None

INFO:tensorflow:Converted call: <bound method Layer._clear_losses of <__main__.EagerLayer object at 0x7fbf6e34b860>>
    args: ()
    kwargs: None

INFO:tensorflow:Converted call: <function executing_eagerly at 0x7fbf86cb58c8>
    args: ()
    kwargs: None

INFO:tensorflow:Converted call: <bound method Layer._maybe_cast_inputs of <__main__.EagerLayer object at 0x7fbf6e34b860>>
    args: (<tf.Tensor 'self:0' shape=(60, 28, 28) dtype=float32>, [<tf.Tensor 'self:0' shape=(60, 28, 28) dtype=float32>])
    kwargs: None

INFO:tensorflow:Converted call: <function assert_input_compatibility at 0x7fbf8276f2f0>
    args: (None, <tf.Tensor 'self:0' shape=(60, 28, 28) dtype=float32>, 'eager_layer_7')
    kwargs: None

INFO:tensorflow:Converted call: <bound method Layer._name_scope of <__main__.EagerLayer object at 0x7fbf6e34b860>>
    args: ()
    kwargs: None

INFO:tensorflow:Converted call: <bound method Layer._autographed_call of <__main__.EagerLayer object at 0x7fbf6e34b860>>
    args: ()
    kwargs: None

INFO:tensorflow:Converted call: <function EagerLayer.call at 0x7fbf6ccfcd08>
    args: (<tf.Tensor 'self:0' shape=(60, 28, 28) dtype=float32>,)
    kwargs: {}

INFO:tensorflow:Converted call: <bound method EagerLayer.call of <__main__.EagerLayer object at 0x7fbf6e34b860>>
    args: (<tf.Tensor 'self:0' shape=(60, 28, 28) dtype=float32>,)
    kwargs: {}

INFO:tensorflow:Converted call: <built-in method now of type object at 0xa33e60>
    args: ()
    kwargs: None

INFO:tensorflow:Converted call: <class 'str'>
    args: (datetime.datetime(2020, 12, 31, 15, 15, 24, 153569),)
    kwargs: None


Currently running eagerly 2020-12-31 15:15:24.153569
INFO:tensorflow:Converted call: <tensorflow.python.keras.layers.core.Dense object at 0x7fbf6cccb198>
    args: (<tf.Tensor 'self:0' shape=(60, 28, 28) dtype=float32>,)
    kwargs: None

I am assuming that the reason it is called twice is to run _set_save_spec, but I am still not sure...

Upvotes: 1
