R__

Reputation: 87

When is the build method called when creating a custom layer in Keras?

Sorry, I am new to deep learning and Keras. I am trying to define a layer myself.

I looked at the example in the Keras documentation, https://keras.io/api/layers/base_layer/#layer-class

import tensorflow as tf
from tensorflow.keras.layers import Layer

class SimpleDense(Layer):

  def __init__(self, units=32):
      super(SimpleDense, self).__init__()
      self.units = units

  def build(self, input_shape):  # Create the state of the layer (weights)
    w_init = tf.random_normal_initializer()
    self.w = tf.Variable(
        initial_value=w_init(shape=(input_shape[-1], self.units),
                             dtype='float32'),
        trainable=True)
    b_init = tf.zeros_initializer()
    self.b = tf.Variable(
        initial_value=b_init(shape=(self.units,), dtype='float32'),
        trainable=True)

  def call(self, inputs):  # Defines the computation from inputs to outputs
      return tf.matmul(inputs, self.w) + self.b

# Instantiates the layer.
linear_layer = SimpleDense(4)

I understand that when I create linear_layer, the __init__ method is called, and when I pass inputs to linear_layer, the call method is called. But I don't understand when the build method is called. More specifically, how is input_shape in the build method specified, and what is its value here? Since I don't know when build is called, I don't know what gets passed as the input_shape argument.

Besides, I want to define a parameter with a fixed size, which is (1, 768) in my case. In that case, should I still use input_shape in the build method?

Upvotes: 2

Views: 1967

Answers (1)

I'mahdi

Reputation: 24049

To understand this SimpleDense layer and answer your questions, we first need to look at the weight and the bias. The weight in SimpleDense is initialized with random numbers and the bias with zeros; during training, both are updated to minimize the loss.

Answer to the first question: the build method is called only once, the first time the layer is used. Keras passes the shape of that first input as input_shape, and at that point the weight and bias are set to random numbers and zeros. The call method, on the other hand, is called for each training batch.

Answer to the second question: yes. In the call method we have access to a batch of data, and the first dimension is the batch dimension.

I wrote an example that prints a message whenever the build and call methods are called, along with the shapes of the input and output data, to clarify the explanation above.
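To see the order of the calls outside of training, here is a minimal sketch that calls a layer directly on two tensors. The class name LoggedDense and the events list are made up for illustration, and I use add_weight instead of tf.Variable; the point is only that build runs on the first call and receives the shape of that first input.

```python
import tensorflow as tf

events = []  # illustration only: records which methods ran, in order


class LoggedDense(tf.keras.layers.Layer):  # hypothetical name
    def __init__(self, units=32):
        super().__init__()
        self.units = units
        events.append("__init__")

    def build(self, input_shape):
        # Keras calls this automatically with the shape of the first input.
        events.append(("build", tuple(input_shape)))
        self.w = self.add_weight(
            shape=(input_shape[-1], self.units),
            initializer="random_normal",
            trainable=True,
        )
        self.b = self.add_weight(
            shape=(self.units,), initializer="zeros", trainable=True
        )

    def call(self, inputs):
        events.append("call")
        return tf.matmul(inputs, self.w) + self.b


layer = LoggedDense(4)
built_before = layer.built        # no weights exist yet

y1 = layer(tf.ones((3, 2)))       # first use: build, then call
y2 = layer(tf.ones((7, 2)))       # later uses: call only

print(built_before, layer.built)
print(events)
```

The printed events list should contain a single build entry, recorded before the first call, with the shape of the first input.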

In the example below:

  1. I use batch_size = 5 and 25 samples, so in each batch the call method has access to 5 samples.
  2. The layer is created and built once, so the build method is called once; there is 1 epoch of 5 batches, so the call method is called 5 times.
  3. units = 4 and the data shape is (25, 2) [samples, features], so total params = 12 <-> 4*2 (units*features) + 4 (bias).
  4. At the end, I attach an image that shows how matmul works, why the output shape is (5, 4), and the formula for computing input*weight+bias.

import tensorflow as tf

class SimpleDense(tf.keras.layers.Layer):
  def __init__(self, units=32):
      super(SimpleDense, self).__init__()
      self.units = units

  def build(self, input_shape):  # Create the state of the layer (weights)
    tf.print('calling build method')
    w_init = tf.random_normal_initializer()
    self.w = tf.Variable(
        initial_value=w_init(shape=(input_shape[-1], self.units),
                             dtype='float32'),trainable=True)
    b_init = tf.zeros_initializer()
    self.b = tf.Variable(initial_value=b_init(shape=(self.units,), 
                                              dtype='float32'),trainable=True)

  def call(self, inputs):  # Defines the computation from inputs to outputs
      tf.print('\ncalling call method')
      tf.print(f'input shape : {inputs.shape}')
      out = tf.matmul(inputs, self.w) + self.b
      tf.print(f'output shape : {out.shape}')
      return out

model = tf.keras.Sequential()
model.add(SimpleDense(units = 4))
model.compile(optimizer = 'adam',loss = 'mse',)
model.fit(tf.random.uniform((25, 2)), tf.ones((25, 1)), batch_size = 5)
model.summary()

Output:

calling build method

calling call method
input shape : (5, 2)
output shape : (5, 4)
1/5 [=====>........................] - ETA: 1s - loss: 0.9794
calling call method
input shape : (5, 2)
output shape : (5, 4)

calling call method
input shape : (5, 2)
output shape : (5, 4)

calling call method
input shape : (5, 2)
output shape : (5, 4)

calling call method
input shape : (5, 2)
output shape : (5, 4)
5/5 [==============================] - 0s 15ms/step - loss: 0.9770
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 simple_dense (SimpleDense)  (5, 4)                    12        
                                                                 
=================================================================
Total params: 12
Trainable params: 12
Non-trainable params: 0
_________________________________________________________________
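The output shape (5, 4) and the 12 params in the summary can also be checked by hand. This is only a sketch with plain tensors; the variable names features, units and batch are mine, and the values are taken from the example above.

```python
import tensorflow as tf

features, units, batch = 2, 4, 5   # values from the example above

x = tf.random.uniform((batch, features))   # one batch of inputs: (5, 2)
w = tf.random.normal((features, units))    # weight: (2, 4)
b = tf.zeros((units,))                     # bias: (4,)

# (5, 2) @ (2, 4) gives (5, 4); the bias is broadcast over the batch dimension.
out = tf.matmul(x, w) + b

# Trainable parameters: one weight per (feature, unit) pair plus one bias per unit.
n_params = features * units + units

print(out.shape, n_params)
```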

[Image: how matmul computes input*weight+bias and why the output shape is (5, 4)]
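For the fixed-size parameter of shape (1, 768) from the question, here is a rough sketch of one way it could look. The build method still receives input_shape from Keras, it just does not have to use it. The class name FixedParamLayer and the way the parameter is used (added to the inputs) are my assumptions, not something from the question.

```python
import tensorflow as tf


class FixedParamLayer(tf.keras.layers.Layer):  # hypothetical name
    def build(self, input_shape):
        # input_shape is still passed in by Keras, but the parameter shape
        # is fixed here, so it is simply not used.
        self.p = self.add_weight(
            shape=(1, 768), initializer="zeros", trainable=True
        )

    def call(self, inputs):
        # Assumed usage: add the parameter to every sample in the batch.
        return inputs + self.p


layer = FixedParamLayer()
out = layer(tf.ones((5, 768)))

print(layer.p.shape, out.shape)
```

Since the shape does not depend on the input, the same add_weight call could also be made in __init__ (after super().__init__()) instead of build.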

Upvotes: 3
