Reputation: 115
I am trying to understand which weights are trained in an RNN. A SimpleRNN layer with 1 unit is easy to understand: if the input shape is [50, 3] (50 time steps, 3 features), there are 3 weights to train, one per feature, plus the bias and the weight for the previous state, which gives 5 parameters. But I am struggling to understand how the parameter count becomes 12, 21, 32 as the number of RNN units increases. Thanks for any guidance.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import SimpleRNN, Dense

model = Sequential([
    # 1 unit: 3 input weights + 1 recurrent weight + 1 bias = 5 parameters
    SimpleRNN(1, return_sequences=False, input_shape=[50, 3]),
    Dense(1)
])
model.summary()
model2 = Sequential([
    # the last recurrent layer does not need to return sequences
    SimpleRNN(2, return_sequences=False, input_shape=[50, 3]),
    Dense(1)
])
model2.summary()
model3 = Sequential([
    SimpleRNN(3, return_sequences=False, input_shape=[50, 3]),
    Dense(1)
])
model3.summary()
model4 = Sequential([
    SimpleRNN(4, return_sequences=False, input_shape=[50, 3]),
    Dense(1)
])
model4.summary()
Model: "sequential_20"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
simple_rnn_22 (SimpleRNN)    (None, 1)                 5
_________________________________________________________________
dense_18 (Dense)             (None, 1)                 2
=================================================================
Total params: 7
Trainable params: 7
Non-trainable params: 0
_________________________________________________________________
Model: "sequential_21"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
simple_rnn_23 (SimpleRNN)    (None, 2)                 12
_________________________________________________________________
dense_19 (Dense)             (None, 1)                 3
=================================================================
Total params: 15
Trainable params: 15
Non-trainable params: 0
_________________________________________________________________
Model: "sequential_22"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
simple_rnn_24 (SimpleRNN)    (None, 3)                 21
_________________________________________________________________
dense_20 (Dense)             (None, 1)                 4
=================================================================
Total params: 25
Trainable params: 25
Non-trainable params: 0
_________________________________________________________________
Model: "sequential_23"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
simple_rnn_25 (SimpleRNN)    (None, 4)                 32
_________________________________________________________________
dense_21 (Dense)             (None, 1)                 5
=================================================================
Total params: 37
Trainable params: 37
Non-trainable params: 0
_________________________________________________________________
Upvotes: 0
Views: 199
Reputation: 289
For your model 2:
model2 = Sequential([
    # the last recurrent layer does not need to return sequences
    SimpleRNN(2, return_sequences=False, input_shape=[50, 3]),
    Dense(1)
])
The image below shows the weights going into one of the neurons: 5 weights (3 from the input features and 2 from the recurrent connections to the layer's 2 units), plus 1 bias. So each neuron has 6 parameters, and the total parameter count is 6 * 2 = 12.
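The formula for your example is:
h * (3 + h) + h
where h is the number of units in the SimpleRNN layer, (3 + h) is the number of weights for each neuron (3 input features plus h recurrent connections), and the last h adds one bias per neuron. This gives 5, 12, 21 and 32 for h = 1, 2, 3 and 4, matching your summaries.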
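As a quick check, the formula can be evaluated in plain Python for the four models in the question. This is only a minimal sketch: `simple_rnn_param_count` is a hypothetical helper written for this answer, not a Keras function, and it assumes the layer uses a bias (the default).

```python
def simple_rnn_param_count(units, n_features):
    """Count trainable parameters of a SimpleRNN layer (hypothetical helper).

    Assumes the layer has a bias term, as in the models from the question.
    """
    input_weights = n_features * units   # one weight per (feature, unit) pair
    recurrent_weights = units * units    # one weight per (unit, unit) pair
    biases = units                       # one bias per unit
    return input_weights + recurrent_weights + biases


# 3 input features, 1 to 4 units, as in model, model2, model3 and model4
counts = [simple_rnn_param_count(h, 3) for h in (1, 2, 3, 4)]
print(counts)  # [5, 12, 21, 32]
```

The three terms correspond to the input weights, the recurrent weights and the biases, so the sum is the same as h * (3 + h) + h with the brackets expanded.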
Upvotes: 1