Reputation: 43
I want to increase the learning rate from batch to batch within one epoch, so that the first batch the net sees in an epoch has a low learning rate and the last batch it sees has a high learning rate. How do I do this in tf.keras?
Upvotes: 3
Views: 2880
Reputation: 4289
To modify the learning rate after every epoch, you can use tf.keras.callbacks.LearningRateScheduler, as mentioned in the docs here.
But in our case, we need to modify the learning rate after every batch is passed to the model. We'll use tf.keras.optimizers.schedules.LearningRateSchedule for this purpose, which modifies the learning rate after each step, i.e. after each gradient update.
Suppose I have 100 samples in my training dataset and my batch size is 5. The number of steps per epoch is then 100 / 5 = 20. In other words, in a single epoch, 20 batches are passed to the model and 20 gradient updates occur.
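Ignoring TensorFlow for a moment, the step arithmetic above can be checked in plain Python (a quick sketch; the helper name is just for illustration):

```python
num_train_samples = 100
batch_size = 5
steps_per_epoch = num_train_samples // batch_size  # 20 gradient updates per epoch

# The optimizer's global step counter maps onto epochs like this:
def epoch_of_step(step):
    return step // steps_per_epoch

print(steps_per_epoch)     # 20
print(epoch_of_step(0))    # 0 -> first epoch
print(epoch_of_step(19))   # 0 -> still the first epoch
print(epoch_of_step(20))   # 1 -> second epoch begins
```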
Using the code given in the docs,
import tensorflow as tf

batch_size = 5
num_train_samples = 100

class MyLRSchedule(tf.keras.optimizers.schedules.LearningRateSchedule):

    def __init__(self, initial_learning_rate):
        self.initial_learning_rate = initial_learning_rate

    def __call__(self, step):
        # Decay the LR as the global step increases.
        return self.initial_learning_rate / (step + 1)

optimizer = tf.keras.optimizers.SGD(learning_rate=MyLRSchedule(0.1))
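Ignoring tensor types, the values this docs schedule produces can be checked in plain Python (a sketch mirroring the formula above; note that it decays the LR, which is the opposite of what the question asks for):

```python
initial_learning_rate = 0.1

def lr_at(step):
    # Same formula as MyLRSchedule.__call__ above.
    return initial_learning_rate / (step + 1)

print(lr_at(0))   # 0.1   (first batch)
print(lr_at(1))   # 0.05
print(lr_at(19))  # 0.005 (last batch of the 1st epoch)
```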
The value of step will go from 0 to 19 for the 1st epoch, considering our example. For the 2nd epoch, it will go from 20 to 39, and so on. For your use-case, we can modify the above like,
import tensorflow as tf

batch_size = 5
num_train_samples = 100
num_steps = num_train_samples // batch_size  # steps per epoch = 20

class MyLRSchedule(tf.keras.optimizers.schedules.LearningRateSchedule):

    def __init__(self, initial_learning_rate):
        self.initial_learning_rate = initial_learning_rate

    def __call__(self, step):
        # Position of the current step within its epoch (0 to num_steps - 1);
        # equivalent to step % num_steps.
        step_in_epoch = step - ((step // num_steps) * num_steps)
        # Update the LR according to step_in_epoch, e.g. increase it linearly
        # from a low value on the first batch up to initial_learning_rate on
        # the last batch of the epoch:
        return self.initial_learning_rate * (step_in_epoch + 1) / num_steps

optimizer = tf.keras.optimizers.SGD(learning_rate=MyLRSchedule(0.1))
The value of step_in_epoch will go from 0 to 19 for the 1st epoch, from 0 to 19 again for the 2nd epoch, and likewise for all epochs. Update the LR accordingly.
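As a sanity check in plain Python (a sketch assuming the simple linear increase suggested above, not the only possible schedule):

```python
initial_learning_rate = 0.1
num_steps = 20  # steps per epoch, as computed above

def lr_at(step):
    # Position within the current epoch, then scale the LR linearly so the
    # first batch gets the lowest LR and the last batch the highest.
    step_in_epoch = step % num_steps
    return initial_learning_rate * (step_in_epoch + 1) / num_steps

print(lr_at(0))   # 0.005 (first batch of any epoch)
print(lr_at(19))  # 0.1   (last batch of the 1st epoch)
print(lr_at(20))  # 0.005 (resets at the start of the 2nd epoch)
```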
Make sure that num_train_samples is perfectly divisible by the batch size; this keeps the number of steps per epoch a whole number and simplifies the calculation.
Upvotes: 2