A C

Reputation: 111

How to implement clip_gradients_by_norm in TensorFlow 2.0?

I would like to use tf.contrib.estimator.clip_gradients_by_norm in TF 2.0, as was possible under TF 1.3; however, with contrib now gone, I need a workaround, or even just some underlying intuition on how it works.

I am aware that this has already been raised on GitHub (https://github.com/tensorflow/tensorflow/issues/28707), but I would like a solution sooner if possible.

# Use gradient descent as the optimizer for training the model.
my_optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.0000001)
my_optimizer = tf.contrib.estimator.clip_gradients_by_norm(my_optimizer, 5.0)

# Configure the linear regression model with our feature columns and optimizer.
# Set a learning rate of 0.0000001 for Gradient Descent.
linear_regressor = tf.estimator.LinearRegressor(
    feature_columns=feature_columns,
    optimizer=my_optimizer
)

More here:

https://colab.research.google.com/notebooks/mlcc/first_steps_with_tensor_flow.ipynb?utm_source=mlcc&utm_campaign=colab-external&utm_medium=referral&utm_content=firststeps-colab&hl=en#scrollTo=ubhtW-NGU802

I've tried using custom gradients as described here: https://www.tensorflow.org/guide/eager

@tf.custom_gradient
def clip_gradient_by_norm(x, norm):
  y = tf.identity(x)
  def grad_fn(dresult):
    return [tf.clip_by_norm(dresult, norm), None]
  return y, grad_fn

with no success.

Upvotes: 3

Views: 1706

Answers (1)

Ileriayo Adebiyi

Reputation: 89

Looking at a comment on this issue (https://github.com/tensorflow/tensorflow/issues/28707#issuecomment-502336827), I discovered that you can modify your code to look like this:

# Use gradient descent as the optimizer for training the model.
# clipnorm=5.0 clips each gradient to a maximum L2 norm of 5.0.
from tensorflow.keras import optimizers
my_optimizer = optimizers.SGD(learning_rate=0.0000001, clipnorm=5.0)

# Configure the linear regression model with our feature columns and optimizer.
# Set a learning rate of 0.0000001 for Gradient Descent.
linear_regressor = tf.estimator.LinearRegressor(
    feature_columns=feature_columns,
    optimizer=my_optimizer
)

Instead of:

# Use gradient descent as the optimizer for training the model.
my_optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.0000001)
my_optimizer = tf.contrib.estimator.clip_gradients_by_norm(my_optimizer, 5.0)

# Configure the linear regression model with our feature columns and optimizer.
# Set a learning rate of 0.0000001 for Gradient Descent.
linear_regressor = tf.estimator.LinearRegressor(
    feature_columns=feature_columns,
    optimizer=my_optimizer
)
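
For the underlying intuition the question asks about: clipnorm=5.0 makes the optimizer rescale any gradient whose L2 norm exceeds 5.0 before applying it. You can reproduce the same effect by hand in a custom training loop. Here is a minimal sketch (the toy variables w and b and the train_step function are illustrative, not part of the original code):

import tensorflow as tf

# Toy linear model standing in for whatever you are training.
w = tf.Variable(2.0)
b = tf.Variable(0.0)
optimizer = tf.keras.optimizers.SGD(learning_rate=0.0000001)

def train_step(x, y, clip_norm=5.0):
    with tf.GradientTape() as tape:
        loss = tf.reduce_mean(tf.square(w * x + b - y))
    grads = tape.gradient(loss, [w, b])
    # This is what clipnorm does: rescale each gradient tensor
    # so its L2 norm is at most clip_norm, then apply as usual.
    clipped = [tf.clip_by_norm(g, clip_norm) for g in grads]
    optimizer.apply_gradients(zip(clipped, [w, b]))
    return loss

Note that clipnorm clips each gradient tensor individually; if you instead want to clip by the global norm across all gradients (which, as far as I can tell, is what the old contrib wrapper did), replace the list comprehension with clipped, _ = tf.clip_by_global_norm(grads, clip_norm).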

Upvotes: 5
