nbro

Reputation: 15837

How can we apply a constraint to the value of a custom trainable variable?

I have defined a custom variable for a certain layer. I would like this variable to take only positive values. Keras provides constraints, but it seems to me that they are only meant for the kernel_constraint and bias_constraint parameters of the Keras' layers.

Is there a (simple) way of constraining the value of a custom trainable variable (i.e. one created with the method add_weight) in Keras (and TensorFlow)?

Upvotes: 1

Views: 3597

Answers (3)

today

Reputation: 33410

I just want to add to the answer given by @xdurch0 that, if you want the values to be non-negative, there is already a built-in NonNeg constraint that does exactly this, and you can use it as follows:

self.add_weight(..., constraint=tf.keras.constraints.NonNeg())
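For instance, a minimal custom layer using this constraint might look like the following (the layer name and shapes here are illustrative, not from the original question):

```python
import tensorflow as tf

# A minimal sketch: a custom layer whose trainable per-feature scale is
# kept non-negative via the built-in NonNeg constraint.
class ScaledLayer(tf.keras.layers.Layer):
    def build(self, input_shape):
        self.scale = self.add_weight(
            name="scale",
            shape=(input_shape[-1],),
            initializer="ones",
            # negative components are set to 0 after each optimizer update
            constraint=tf.keras.constraints.NonNeg(),
            trainable=True,
        )

    def call(self, inputs):
        return inputs * self.scale
```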

Upvotes: 3

xdurch0

Reputation: 10474

TensorFlow variables support constraints, and this includes variables created via add_weight. See the docs here.

For example, if you want to force a variable to have values 0 < x < 1:

self.add_weight(shape=some_shape, constraint=lambda x: tf.clip_by_value(x, 0, 1))

In general, constraint should be a function; this function takes the variable as input and returns a new value for it (in this case, the value clipped to the range [0, 1]).

Note that this is implemented by simply calling the function on the variable after the optimizer performs its gradient step. This means that values that "want" to be outside the range will be clipped to hard 0s and 1s, and you might end up with many values sitting exactly on this boundary. So, as @y.selivonchyk notes, this is not "mathematically sound": the gradients don't know about the constraint. You might want to combine the constraint with the regularization they propose for the best effect.
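The mechanism can be spelled out by hand, with the gradient step and the constraint call written explicitly (a sketch with made-up values; plain SGD with learning rate 1.0):

```python
import tensorflow as tf

# The optimizer first updates the variable, then calls the constraint
# function on the result and assigns the returned value back.
constraint = lambda x: tf.clip_by_value(x, 0.0, 1.0)
v = tf.Variable([0.05, 0.95], constraint=constraint)

grad = tf.constant([1.0, -1.0])   # a gradient pushing both entries out of [0, 1]
v.assign_sub(1.0 * grad)          # plain SGD step: v becomes [-0.95, 1.95]
v.assign(v.constraint(v))         # what Keras does after the step: clip to [0, 1]

print(v.numpy())                  # [0. 1.] -- both values pinned at the boundary
```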

Upvotes: 2

y.selivonchyk

Reputation: 9900

There is unlikely to be a strictly mathematically sound way to keep the gradients for that variable from ever pushing it below zero. However, you can add a prior to your model, namely "variable X should stay non-negative", and increase the loss whenever this prior does not hold. The way to do it would be to:

  1. Calculate a mathematical expression that would be positive only when your var is negative, which would be something along the lines of K.sum(K.relu(-var))
  2. Expose the result of this expression as a second output of the model
  3. Apply a linear loss to this output, which would be summed up with your training loss (you can provide a weight for this summation)

This solution has drawbacks: some components might still be negative after an iteration, since the "regularizing" gradient lags behind by one iteration (this may be remedied with a stricter rule such as K.sum(K.relu(-var+1))), and depending on your loss weight you might end up zeroing out components of this variable.
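The steps above can also be wired up with add_loss instead of a second model output, which has the same effect with less plumbing. A sketch under assumed names: the layer PenalizedLayer and the penalty weight alpha are illustrative, not from the original answer.

```python
import tensorflow as tf

# Sketch of the penalty idea: the prior "var should be non-negative" is
# turned into an extra loss term that is positive only when some
# component of the variable is negative.
class PenalizedLayer(tf.keras.layers.Layer):
    def __init__(self, units, alpha=10.0, **kwargs):
        super().__init__(**kwargs)
        self.units = units
        self.alpha = alpha  # weight of the penalty in the total loss

    def build(self, input_shape):
        self.var = self.add_weight(
            name="var", shape=(self.units,), initializer="zeros", trainable=True
        )

    def call(self, inputs):
        # sum(relu(-var)) is zero when var >= 0 and grows as var goes negative
        penalty = tf.reduce_sum(tf.nn.relu(-self.var))
        self.add_loss(self.alpha * penalty)
        return inputs + self.var
```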

Upvotes: 1
