Reputation: 111
I would like to use tf.contrib.estimator.clip_gradients_by_norm in TF 2.0, as was possible under TF 1.3. However, with contrib now gone, I need a workaround, or even just some underlying intuition on how it works.
I am aware that this has already been raised on GitHub (https://github.com/tensorflow/tensorflow/issues/28707), but I would like a solution sooner if possible.
# Use gradient descent as the optimizer for training the model.
my_optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.0000001)
my_optimizer = tf.contrib.estimator.clip_gradients_by_norm(my_optimizer, 5.0)

# Configure the linear regression model with our feature columns and optimizer.
# Set a learning rate of 0.0000001 for Gradient Descent.
linear_regressor = tf.estimator.LinearRegressor(
    feature_columns=feature_columns,
    optimizer=my_optimizer
)
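For reference, my rough mental model of what the helper does (a sketch with my own placeholder loss and variable, not the actual contrib implementation) is: compute the gradients, clip them, then apply the clipped values, all hidden behind the optimizer object handed to the Estimator:

# Sketch of my mental model (TF 1.x style, placeholder loss/variable):
# compute gradients, clip each one by norm, then apply the clipped gradients.
import tensorflow as tf

x = tf.Variable([3.0])
loss = tf.reduce_sum(tf.square(x))

my_optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.0000001)
grads_and_vars = my_optimizer.compute_gradients(loss)
clipped = [(tf.clip_by_norm(g, 5.0), v)
           for g, v in grads_and_vars if g is not None]
train_op = my_optimizer.apply_gradients(clipped)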
I've tried using custom gradients as described in the eager execution guide (https://www.tensorflow.org/guide/eager):
@tf.custom_gradient
def clip_gradient_by_norm(x, norm):
    y = tf.identity(x)
    def grad_fn(dresult):
        return [tf.clip_by_norm(dresult, norm), None]
    return y, grad_fn
with no success.
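As a fallback I can clip by hand in an eager training step like the sketch below (placeholder variable and loss, not my real model), but I'd prefer something I can pass straight to LinearRegressor as its optimizer:

# Fallback sketch: manual clipping in a TF 2.0 eager training step.
import tensorflow as tf

w = tf.Variable([3.0])
optimizer = tf.keras.optimizers.SGD(learning_rate=0.0000001)

def train_step(features, labels):
    with tf.GradientTape() as tape:
        loss = tf.reduce_mean(tf.square(labels - w * features))
    grads = tape.gradient(loss, [w])
    clipped = [tf.clip_by_norm(g, 5.0) for g in grads]
    optimizer.apply_gradients(zip(clipped, [w]))
    return loss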
Upvotes: 3
Views: 1706
Reputation: 89
Based on a comment on that GitHub issue (https://github.com/tensorflow/tensorflow/issues/28707#issuecomment-502336827), you can modify your code to look like this:
# Use gradient descent as the optimizer for training the model.
from tensorflow.keras import optimizers

my_optimizer = optimizers.SGD(lr=0.0000001, clipnorm=5.0)

# Configure the linear regression model with our feature columns and optimizer.
# Set a learning rate of 0.0000001 for Gradient Descent.
linear_regressor = tf.estimator.LinearRegressor(
    feature_columns=feature_columns,
    optimizer=my_optimizer
)
Instead of:
# Use gradient descent as the optimizer for training the model.
my_optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.0000001)
my_optimizer = tf.contrib.estimator.clip_gradients_by_norm(my_optimizer, 5.0)

# Configure the linear regression model with our feature columns and optimizer.
# Set a learning rate of 0.0000001 for Gradient Descent.
linear_regressor = tf.estimator.LinearRegressor(
    feature_columns=feature_columns,
    optimizer=my_optimizer
)
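One caveat worth checking: as far as I know, clipnorm clips each gradient tensor by its own norm, while the old contrib helper may have clipped by the global norm across all gradients, so the two are not guaranteed to be numerically identical. Also, lr still works here but is a deprecated alias for learning_rate, so the non-deprecated spelling would be:

my_optimizer = optimizers.SGD(learning_rate=0.0000001, clipnorm=5.0)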
Upvotes: 5