Reputation: 303
Given the following code:
from tensorflow.keras.models import Model, load_model
from tensorflow.keras.layers import Input, Dense, Lambda, Add, Conv2D, Flatten
from tensorflow.keras.optimizers import RMSprop
X_input = Input(input_shape)  # missing in the original snippet; X_input must be an Input layer
X = Flatten(input_shape=input_shape)(X_input)
X = Dense(512, activation="elu", kernel_initializer='he_uniform')(X)
action = Dense(action_space, activation="softmax", kernel_initializer='he_uniform')(X)
value = Dense(1, kernel_initializer='he_uniform')(X)
Actor = Model(inputs = X_input, outputs = action)
Actor.compile(loss=ppo_loss, optimizer=RMSprop(learning_rate=lr))
Critic = Model(inputs = X_input, outputs = value)
Critic.compile(loss='mse', optimizer=RMSprop(learning_rate=lr))
Actor.fit(...)
Critic.predict(...)
Are Actor and Critic separate networks, or do I partially fit Critic with Actor.fit()?
Upvotes: 1
Views: 86
Reputation: 2822
The Critic and Actor networks share the same layers for the most part; they differ only in the last layer, where Actor has the action head and Critic has the value head. This is visible when you compare Actor.summary() with Critic.summary(). See below.
Actor.summary()
Model: "model_8"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_8 (InputLayer)         [(None, 784)]             0
flatten_6 (Flatten)          (None, 784)               0
dense_16 (Dense)             (None, 512)               401920
dense_17 (Dense)             (None, 1)                 513
=================================================================
Total params: 402,433
Trainable params: 402,433
Non-trainable params: 0
Critic.summary()
Model: "model_9"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_8 (InputLayer)         [(None, 784)]             0
flatten_6 (Flatten)          (None, 784)               0
dense_16 (Dense)             (None, 512)               401920
dense_18 (Dense)             (None, 1)                 513
=================================================================
Total params: 402,433
Trainable params: 402,433
Non-trainable params: 0
You can see that the first three layers carry the same names in both summaries, so they are the same objects in memory. You can also verify this with layers[n].get_weights(): it returns identical weights for the shared layers of both networks.
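For instance, the sharing can be checked directly. This is a minimal sketch: the question does not give input_shape or action_space, so hypothetical values are used here.

```python
import numpy as np
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input, Dense, Flatten

# Hypothetical shapes; the question does not define them.
input_shape = (28, 28)
action_space = 4

# Same construction as in the question: one input, a shared trunk, two heads.
X_input = Input(input_shape)
X = Flatten()(X_input)
X = Dense(512, activation="elu", kernel_initializer="he_uniform")(X)
action = Dense(action_space, activation="softmax", kernel_initializer="he_uniform")(X)
value = Dense(1, kernel_initializer="he_uniform")(X)

Actor = Model(inputs=X_input, outputs=action)
Critic = Model(inputs=X_input, outputs=value)

# The shared Dense layer is literally the same Python object in both models...
print(Actor.layers[2] is Critic.layers[2])  # True
# ...so get_weights() returns identical arrays:
w_actor = Actor.layers[2].get_weights()[0]
w_critic = Critic.layers[2].get_weights()[0]
print(np.array_equal(w_actor, w_critic))  # True
```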
So when you fit Actor, the weights of every layer except Critic's last one get adjusted, which also affects Critic, because those layers are shared. But the last layer of Critic is never trained by this, so when you call Critic.predict(), its predictions do not reflect the training done on Actor. So yes, you partially fit Critic with Actor.fit().
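This partial-training effect can be demonstrated end to end. A minimal sketch, assuming hypothetical input_shape=(28, 28), action_space=4, and a stand-in categorical cross-entropy loss, since the question's ppo_loss is not given: fitting Actor moves the shared trunk weights seen through Critic, while Critic's own value head stays at its initial values.

```python
import numpy as np
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input, Dense, Flatten
from tensorflow.keras.optimizers import RMSprop

# Hypothetical shapes and loss; the question's input_shape, action_space
# and ppo_loss are not given, so stand-ins are used here.
input_shape = (28, 28)
action_space = 4

X_input = Input(input_shape)
X = Flatten()(X_input)
X = Dense(512, activation="elu", kernel_initializer="he_uniform")(X)
action = Dense(action_space, activation="softmax", kernel_initializer="he_uniform")(X)
value = Dense(1, kernel_initializer="he_uniform")(X)

Actor = Model(inputs=X_input, outputs=action)
Critic = Model(inputs=X_input, outputs=value)
Actor.compile(loss="categorical_crossentropy", optimizer=RMSprop(learning_rate=1e-3))

# Snapshot Critic's weights before any training of Actor.
shared_before = Critic.layers[2].get_weights()[0].copy()
head_before = Critic.layers[3].get_weights()[0].copy()

# Train Actor on random data for one epoch.
x = np.random.rand(8, *input_shape).astype("float32")
y = np.eye(action_space)[np.random.randint(0, action_space, 8)]
Actor.fit(x, y, epochs=1, verbose=0)

shared_after = Critic.layers[2].get_weights()[0]
head_after = Critic.layers[3].get_weights()[0]
print(np.array_equal(shared_before, shared_after))  # False: shared trunk moved
print(np.array_equal(head_before, head_after))      # True: value head untouched
```

Because the value head is the only part of Critic that Actor.fit() cannot reach, Critic still needs its own Critic.fit() calls before Critic.predict() is meaningful.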
Upvotes: 1