Reputation: 87
Sorry, I am new to deep learning and Keras. I am trying to define a layer myself.
I looked into the Keras documentation, https://keras.io/api/layers/base_layer/#layer-class, which gives this example:
import tensorflow as tf
from tensorflow.keras.layers import Layer

class SimpleDense(Layer):
    def __init__(self, units=32):
        super(SimpleDense, self).__init__()
        self.units = units

    def build(self, input_shape):  # Create the state of the layer (weights)
        w_init = tf.random_normal_initializer()
        self.w = tf.Variable(
            initial_value=w_init(shape=(input_shape[-1], self.units),
                                 dtype='float32'),
            trainable=True)
        b_init = tf.zeros_initializer()
        self.b = tf.Variable(
            initial_value=b_init(shape=(self.units,), dtype='float32'),
            trainable=True)

    def call(self, inputs):  # Defines the computation from inputs to outputs
        return tf.matmul(inputs, self.w) + self.b

# Instantiates the layer.
linear_layer = SimpleDense(4)
I understand that when I create linear_layer, the __init__ method is called, and when I pass inputs to linear_layer, the call method is called. But I don’t get when the build method is called. More specifically, how is input_shape in the build method specified? What is input_shape here? Since I don’t know when the build method is called, I don’t know what gets passed in as the input_shape argument.
Besides, I want to specify a parameter with a fixed size, which is (1,768) in my case. In this case, should I still use input_shape in the build method?
Upvotes: 2
Views: 1967
Reputation: 24049
To understand this SimpleDense layer and answer your questions, we first need to explain the weight and bias. The weight in SimpleDense is initialized with random numbers and the bias with zeros; during training, both are updated to minimize the loss.
Answer to the first question: the build method is called only once, the first time the layer is used. Keras calls it with the shape of the layer's input as input_shape, and that is when the weight and bias are created and set to random and zero values. The call method, by contrast, is called for each training batch.
Answer to the second question: yes. In the call method we have access to a batch of data, and its first dimension is the batch size.
I wrote an example that prints a message whenever the build and call methods are called, along with the shapes of the input and output data, to clarify the explanation above.
In the example below:
batch_size = 5 and there are 25 samples, so the epoch has 5 steps, and in each step the call method receives a batch of 5 samples.
The layer is created and built once, so the build method is called once; the single epoch has 5 steps, so the call method is called 5 times.
units = 4 and the data shape is (25, 2) [samples, features], so total params = 12 <-> 4*2 (units*features) + 4 (bias).
matmul multiplies the (5, 2) input by the (2, 4) weight, which is why the output shape is (5, 4); the layer computes input*weight + bias.
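The shape and parameter arithmetic described above can be checked on its own with NumPy. This is only a sketch, separate from the Keras example that follows; the variable names are illustrative assumptions, and the numbers are taken from that example.

```python
import numpy as np

# Sizes taken from the Keras example: one batch of 5 samples,
# 2 features per sample, 4 units in the layer.
batch, features, units = 5, 2, 4

x = np.random.uniform(size=(batch, features))  # one batch of inputs
w = np.random.normal(size=(features, units))   # weight matrix, random init
b = np.zeros(units)                            # bias vector, zero init

# (5, 2) @ (2, 4) gives (5, 4); the bias is broadcast over the 5 rows.
out = x @ w + b

# Trainable values: 2*4 weights plus 4 biases.
n_params = w.size + b.size

print(out.shape, n_params)
```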
import tensorflow as tf

class SimpleDense(tf.keras.layers.Layer):
    def __init__(self, units=32):
        super(SimpleDense, self).__init__()
        self.units = units

    def build(self, input_shape):  # Create the state of the layer (weights)
        tf.print('calling build method')
        w_init = tf.random_normal_initializer()
        self.w = tf.Variable(
            initial_value=w_init(shape=(input_shape[-1], self.units),
                                 dtype='float32'),
            trainable=True)
        b_init = tf.zeros_initializer()
        self.b = tf.Variable(
            initial_value=b_init(shape=(self.units,), dtype='float32'),
            trainable=True)

    def call(self, inputs):  # Defines the computation from inputs to outputs
        tf.print('\ncalling call method')
        tf.print(f'input shape : {inputs.shape}')
        out = tf.matmul(inputs, self.w) + self.b
        tf.print(f'output shape : {out.shape}')
        return out

model = tf.keras.Sequential()
model.add(SimpleDense(units=4))
model.compile(optimizer='adam', loss='mse')
model.fit(tf.random.uniform((25, 2)), tf.ones((25, 1)), batch_size=5)
model.summary()
Output:
calling build method
calling call method
input shape : (5, 2)
output shape : (5, 4)
1/5 [=====>........................] - ETA: 1s - loss: 0.9794
calling call method
input shape : (5, 2)
output shape : (5, 4)
calling call method
input shape : (5, 2)
output shape : (5, 4)
calling call method
input shape : (5, 2)
output shape : (5, 4)
calling call method
input shape : (5, 2)
output shape : (5, 4)
5/5 [==============================] - 0s 15ms/step - loss: 0.9770
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
simple_dense (SimpleDense)   (5, 4)                    12
=================================================================
Total params: 12
Trainable params: 12
Non-trainable params: 0
_________________________________________________________________
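On the fixed-size parameter from the question: the following is a minimal sketch, not taken from the Keras documentation, of a layer whose parameter always has shape (1, 768). The names FixedParamLayer, p, and seen_shapes are hypothetical, and it assumes the parameter is simply added to inputs whose last dimension is 768. Because the shape is fixed, build can ignore input_shape when creating the weight. The sketch still records input_shape, to show that Keras passes in the shape of the first input and calls build only once.

```python
import tensorflow as tf

# Hypothetical helper list: records every input_shape that build() receives.
seen_shapes = []


class FixedParamLayer(tf.keras.layers.Layer):
    """Illustrative layer with one trainable parameter of fixed shape (1, 768)."""

    def build(self, input_shape):
        # Keras calls build() automatically the first time the layer is
        # called, passing the shape of that first input.
        seen_shapes.append(tuple(input_shape))
        # The parameter size is fixed, so input_shape is not needed here.
        self.p = self.add_weight(
            name="p", shape=(1, 768), initializer="zeros", trainable=True
        )

    def call(self, inputs):
        # The (1, 768) parameter is broadcast across the batch dimension.
        return inputs + self.p


layer = FixedParamLayer()
built_before = layer.built            # build() has not run yet
out = layer(tf.ones((3, 768)))        # first call: build() runs, then call()
built_after = layer.built
out_again = layer(tf.ones((5, 768)))  # later calls skip build()

print(built_before, built_after, seen_shapes)
```

If the parameter size did depend on the input (as the weight in SimpleDense does), input_shape would be the place to read it from.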
Upvotes: 3