Reputation: 111
I want to optimize two loss components separately, so I am thinking of running two optimizers for them.
Is changing the learning rate over epochs in TensorFlow 2, using a custom training loop in this way, OK?
batch_size = 4
epochs = 50
myLearnRate1 = 1e-4
myLearnRate2 = 1e-4
X_train, X_test = train_data, val_data

for epoch in range(0, epochs):
    train_loss = []
    for i in range(0, len(X_train) // batch_size):
        X = X_train[i * batch_size:min(len(X_train), (i + 1) * batch_size)]
        Image, Mask_p, Mask_n = create_mask(X)
        Lr1 = myLearnRate1 / (1 + (epoch / 37))
        Lr2 = myLearnRate2 / (1 + (epoch / 37))
        optimizer1 = tf.keras.optimizers.Adam(learning_rate=Lr1)
        optimizer2 = tf.keras.optimizers.Adam(learning_rate=Lr2)
        loss = train_on_batch(Image, Mask_p, Mask_n, optimizer1, optimizer2)
Basically, on each iteration I am creating the optimizers with the current (decayed) learning rate and passing them to the training function.
Training function:
def train_on_batch(X_original, X_p, X_n, optimizer1, optimizer2):
    with tf.GradientTape(persistent=True) as tape:
        # Forward pass.
        recon, latent = autoencoder_model([X_original, X_p, X_n], training=True)
        # Loss values for this batch.
        loss_value1_, loss_value2 = assymetric_fun(X_original, X_p, X_n, recon)
        loss_value1 = -1.0 * loss_value1_
    # Compute gradients (the tape is persistent, so it can be used twice).
    grads1 = tape.gradient(loss_value1, autoencoder_model.trainable_variables)
    grads2 = tape.gradient(loss_value2, autoencoder_model.trainable_variables)
    # Update weights.
    optimizer1.apply_gradients(zip(grads1, autoencoder_model.trainable_variables))
    optimizer2.apply_gradients(zip(grads2, autoencoder_model.trainable_variables))
    return loss_value1_ + loss_value2
Upvotes: 3
Views: 787
Reputation: 124
The Adam optimizer (like most stochastic gradient-based optimizers in TF) uses information from previous training steps, in Adam's case running estimates of the first and second moments of the gradients, to compute the updates of your model parameters.
A problem with your implementation is that you redefine your optimizers on every batch, which means you lose that internal state from the previous training step every time. The short sketch below illustrates the point.
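As a minimal sketch (my own toy example, not code from your question), a fresh Adam optimizer starts with an empty step counter and zeroed moment estimates, so re-creating it each batch throws away what it has accumulated:

import tensorflow as tf

var = tf.Variable(1.0)
opt = tf.keras.optimizers.Adam(learning_rate=1e-4)

# Take a few optimizer steps on a toy loss.
for step in range(3):
    with tf.GradientTape() as tape:
        loss = var ** 2
    grads = tape.gradient(loss, [var])
    opt.apply_gradients(zip(grads, [var]))

# The optimizer now carries state: a step counter and, internally,
# per-variable moment estimates.
print(int(opt.iterations))   # 3

# Re-creating the optimizer (as your loop does once per batch) starts
# that state from scratch again.
opt = tf.keras.optimizers.Adam(learning_rate=1e-4)
print(int(opt.iterations))   # 0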
The optimizer definitions should therefore be outside the training loop. Now let's look at how to change the learning rate during training. I know two ways. The first is to pass a learning rate schedule to the optimizer:
initial_learning_rate = 0.1
decay_steps = 1.0
decay_rate = 0.5
learning_rate_fn = tf.keras.optimizers.schedules.InverseTimeDecay(
    initial_learning_rate, decay_steps, decay_rate)

optimizer = tf.keras.optimizers.SGD(learning_rate=learning_rate_fn)
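A schedule is a callable that the optimizer evaluates at its current step counter, so the learning rate decays automatically at every optimizer step. As a rough sketch (steps_per_epoch below is a placeholder I introduce, and the mapping to your per-epoch formula is only an approximation, since the schedule decays per step rather than once per epoch):

import tensorflow as tf

learning_rate_fn = tf.keras.optimizers.schedules.InverseTimeDecay(
    initial_learning_rate=0.1, decay_steps=1.0, decay_rate=0.5)

# lr(step) = initial_learning_rate / (1 + decay_rate * step / decay_steps)
for step in [0, 1, 2, 10]:
    print(step, float(learning_rate_fn(step)))  # 0.1, ~0.0667, 0.05, ~0.0167

# Something close to your lr = 1e-4 / (1 + epoch / 37), decayed
# continuously instead of once per epoch:
steps_per_epoch = 250  # placeholder for len(X_train) // batch_size
schedule = tf.keras.optimizers.schedules.InverseTimeDecay(
    initial_learning_rate=1e-4,
    decay_steps=37 * steps_per_epoch,
    decay_rate=1.0)
optimizer1 = tf.keras.optimizers.Adam(learning_rate=schedule)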
The second way is to assign the learning rate yourself during training. In this case, no scheduler is needed and the learning rate can be set to any given value at any point. This post deals with it. In short:
from keras import backend as K
# or
from tensorflow.keras import backend as K
K.set_value(optimizer.learning_rate, 0.001)
The optimizer will then automatically use the new learning rate on the following training steps.
The global code should look like this:
learning_rate = ...  # some value or a scheduler
optimizer = Optimizer(learning_rate)

# training loop
for epoch in range(epochs):
    # function that updates the model parameters and computes the losses
    training_step(...)
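Adapted to your two-loss setup, a sketch could look like the following. It assumes create_mask, train_on_batch, train_data and val_data are defined as in your question, and uses the manual K.set_value approach to apply your decay rule once per epoch:

import tensorflow as tf
from tensorflow.keras import backend as K

batch_size = 4
epochs = 50
myLearnRate1 = 1e-4
myLearnRate2 = 1e-4

# Create both optimizers ONCE, so their internal state survives across batches.
optimizer1 = tf.keras.optimizers.Adam(learning_rate=myLearnRate1)
optimizer2 = tf.keras.optimizers.Adam(learning_rate=myLearnRate2)

X_train, X_test = train_data, val_data

for epoch in range(epochs):
    # Update the learning rates once per epoch with your decay rule.
    K.set_value(optimizer1.learning_rate, myLearnRate1 / (1 + epoch / 37))
    K.set_value(optimizer2.learning_rate, myLearnRate2 / (1 + epoch / 37))

    train_loss = []
    for i in range(len(X_train) // batch_size):
        X = X_train[i * batch_size:min(len(X_train), (i + 1) * batch_size)]
        Image, Mask_p, Mask_n = create_mask(X)
        loss = train_on_batch(Image, Mask_p, Mask_n, optimizer1, optimizer2)
        train_loss.append(loss)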
Upvotes: 0