rakeshKM

Reputation: 111

Changing learning rate over epochs in TensorFlow 2 using a custom training loop

I want to optimize 2 loss components separately, so I am thinking of running two optimizers for them.

Is changing the learning rate over epochs in TensorFlow 2, using a custom training loop in this way, OK?

    batch_size = 4
    epochs = 50
    myLearnRate1 = 1e-4
    myLearnRate2 = 1e-4
    X_train, X_test = train_data, val_data

    for epoch in range(0, epochs):
        train_loss = []
        for i in range(0, len(X_train) // batch_size):
            X = X_train[i * batch_size:min(len(X_train), (i + 1) * batch_size)]
            Image, Mask_p, Mask_n = create_mask(X)

            Lr1 = myLearnRate1 / (1 + (epoch / 37))
            Lr2 = myLearnRate2 / (1 + (epoch / 37))
            optimizer1 = tf.keras.optimizers.Adam(learning_rate=Lr1)
            optimizer2 = tf.keras.optimizers.Adam(learning_rate=Lr2)

            loss = train_on_batch(Image, Mask_p, Mask_n, optimizer1, optimizer2)

Basically, I am passing the optimizers, each with a different learning rate, to the training function in every iteration.

The training function:

def train_on_batch(X_original, X_p, X_n, optimizer1, optimizer2):
  with tf.GradientTape(persistent=True) as tape:
    # Forward pass.
    recon, latent = autoencoder_model([X_original, X_p, X_n], training=True)
    # Loss values for this batch.
    loss_value1_, loss_value2 = assymetric_fun(X_original, X_p, X_n, recon)
    loss_value1 = -1.0 * loss_value1_

  # Compute the gradients of each loss w.r.t. the model variables.
  grads1 = tape.gradient(loss_value1, autoencoder_model.trainable_variables)
  grads2 = tape.gradient(loss_value2, autoencoder_model.trainable_variables)
  del tape  # release the persistent tape explicitly

  # Update the weights with each optimizer.
  optimizer1.apply_gradients(zip(grads1, autoencoder_model.trainable_variables))
  optimizer2.apply_gradients(zip(grads2, autoencoder_model.trainable_variables))

  return loss_value1_ + loss_value2

Upvotes: 3

Views: 787

Answers (1)

L Maxime

Reputation: 124

The Adam optimizer (and most stochastic gradient-based optimizers in TF) uses information from previous training steps to update your model parameters; for Adam, this is the running estimates of the first and second moments of the gradients.

A problem with your implementation is that you redefine your optimizers on every batch. This means you lose the optimizers' internal state (moment estimates and step count) from the previous training steps.
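
Here is a minimal sketch of what gets lost (the variable `v` and the constant gradient are just placeholders for illustration):

import tensorflow as tf

v = tf.Variable(1.0)

opt = tf.keras.optimizers.Adam(learning_rate=1e-3)
opt.apply_gradients([(tf.constant(0.5), v)])
print(opt.iterations.numpy())  # 1 -> the optimizer has taken a step and built up state

opt = tf.keras.optimizers.Adam(learning_rate=1e-3)  # re-created, as in your batch loop
print(opt.iterations.numpy())  # 0 -> step count and moment estimates are gone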

The optimizer definitions should be outside the training loop. Now let's see how to change the learning rate during training. I know two ways.

Learning rate scheduler

initial_learning_rate = 0.1
decay_steps = 1.0
decay_rate = 0.5
learning_rate_fn = tf.keras.optimizers.schedules.InverseTimeDecay(
    initial_learning_rate, decay_steps, decay_rate)

optimizer = tf.keras.optimizers.SGD(learning_rate=learning_rate_fn)
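
To reproduce your per-epoch formula lr = 1e-4 / (1 + epoch / 37) with a schedule, keep in mind that the schedule is evaluated per optimizer step (per batch), not per epoch. A sketch, assuming steps_per_epoch = len(X_train) // batch_size from your code:

import tensorflow as tf

steps_per_epoch = len(X_train) // batch_size  # as in the question

# Decays smoothly on every step instead of once per epoch, but follows the same curve.
learning_rate_fn = tf.keras.optimizers.schedules.InverseTimeDecay(
    initial_learning_rate=1e-4,
    decay_steps=37 * steps_per_epoch,
    decay_rate=1.0)

optimizer1 = tf.keras.optimizers.Adam(learning_rate=learning_rate_fn)
optimizer2 = tf.keras.optimizers.Adam(learning_rate=learning_rate_fn)

Each optimizer drives the schedule with its own step counter, but since both take one step per batch they stay in sync.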

"Manually" assign the learning rate

In this case, no scheduler is needed and the learning rate can be assigned a given value directly. This post deals with it. In short:

from keras import backend as K
# or 
from tensorflow.keras import backend as K
K.set_value(optimizer.learning_rate, 0.001)

The optimizer will automatically pick up the updated learning rate.
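
With your setup, you could assign the decayed value once at the top of the epoch loop, a sketch using your myLearnRate1, myLearnRate2 and epochs (the optimizers are created once, before the loop):

import tensorflow as tf
from tensorflow.keras import backend as K

optimizer1 = tf.keras.optimizers.Adam(learning_rate=myLearnRate1)
optimizer2 = tf.keras.optimizers.Adam(learning_rate=myLearnRate2)

for epoch in range(epochs):
    K.set_value(optimizer1.learning_rate, myLearnRate1 / (1 + epoch / 37))
    K.set_value(optimizer2.learning_rate, myLearnRate2 / (1 + epoch / 37))
    # ... batch loop calling train_on_batch(Image, Mask_p, Mask_n, optimizer1, optimizer2) ...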

Overall, the code should look like this:

learning_rate = ...  # some value or a learning rate schedule
optimizer = Optimizer(learning_rate)  # created once, outside the loop

# training loop
for epoch in range(epochs):
    # function that updates the model parameters and computes the losses
    training_step(...)
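
Adapted to your two-optimizer setup (using your own names create_mask, train_on_batch, X_train, batch_size and epochs), a sketch could look like this:

import tensorflow as tf

# Both optimizers are created once, outside the loops, so their internal
# state (Adam's moment estimates) survives from batch to batch.
learning_rate_fn = tf.keras.optimizers.schedules.InverseTimeDecay(
    initial_learning_rate=1e-4,
    decay_steps=37 * (len(X_train) // batch_size),
    decay_rate=1.0)
optimizer1 = tf.keras.optimizers.Adam(learning_rate=learning_rate_fn)
optimizer2 = tf.keras.optimizers.Adam(learning_rate=learning_rate_fn)

for epoch in range(epochs):
    train_loss = []
    for i in range(len(X_train) // batch_size):
        X = X_train[i * batch_size:(i + 1) * batch_size]
        Image, Mask_p, Mask_n = create_mask(X)
        loss = train_on_batch(Image, Mask_p, Mask_n, optimizer1, optimizer2)
        train_loss.append(loss)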

Upvotes: 0
