Reputation: 15837
I have defined a custom variable for a certain layer, and I would like this variable to take only positive values. Keras provides constraints, but it seems to me that they are only meant for the kernel_constraint and bias_constraint parameters of Keras layers.
Is there a (simple) way of constraining the value of a custom trainable variable (i.e. one created with the add_weight method) in Keras (and TensorFlow)?
Upvotes: 1
Views: 3597
Reputation: 33410
I just want to add to the answer given by @xdurch0 that if you want the values to be non-negative, there is already a built-in NonNeg constraint which does exactly this, and you can use it as follows:
self.add_weight(..., constraint=tf.keras.constraints.NonNeg())
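For context, a minimal sketch of how this could look inside a custom layer (assuming TF 2.x; the layer name MyScale and the weight name "scale" are only illustrative):

import tensorflow as tf

# Minimal sketch: a custom layer whose trainable weight "scale"
# is kept non-negative via the built-in NonNeg constraint.
class MyScale(tf.keras.layers.Layer):
    def build(self, input_shape):
        self.scale = self.add_weight(
            name="scale",
            shape=(input_shape[-1],),
            initializer="ones",
            trainable=True,
            # Projects the weight back to >= 0 after each optimizer update.
            constraint=tf.keras.constraints.NonNeg(),
        )

    def call(self, inputs):
        # Element-wise scaling by the non-negative weight.
        return inputs * self.scale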
Upvotes: 3
Reputation: 10474
TensorFlow variables support constraints, and this includes variables created via add_weight. See the docs here.
For example, if you want to force a variable to stay within the range [0, 1]:
self.add_weight(shape=some_shape, constraint=lambda x: tf.clip_by_value(x, 0, 1))
In general, constraint should be a function that takes the variable as input and returns a new value for it; in this case, the values clipped to 0 and 1.
Note that this is implemented by simply calling the function on the variable after the optimizer does its gradient step. This means that values that "want" to be outside the range will be clipped to hard 0s and 1s, and you might end up with lots of values sitting precisely at this boundary. So, as @y.selivonchyk notes, this is not "mathematically sound", i.e. the gradients don't know about the constraint. You might want to combine the constraint with the regularization they propose for the best effect.
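A small sketch of that mechanism, assuming a TF 2.x setup in which the built-in Keras optimizer applies the variable's constraint right after its update step (the specific values and learning rate here are made up for illustration):

import tensorflow as tf

# The constraint function is called on the variable after the gradient
# step, so updates that overshoot the range get clipped hard.
v = tf.Variable([0.05, 0.9], constraint=lambda x: tf.clip_by_value(x, 0.0, 1.0))
opt = tf.keras.optimizers.SGD(learning_rate=1.0)

with tf.GradientTape() as tape:
    # A loss whose gradient pushes v[0] below 0 and v[1] above 1.
    loss = tf.reduce_sum(v * tf.constant([1.0, -1.0]))

grads = tape.gradient(loss, [v])
opt.apply_gradients(zip(grads, [v]))  # constraint applied after the update

print(v.numpy())  # expected: [0. 1.] -- both values end up exactly on the boundary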
Upvotes: 2
Reputation: 9900
It is unlikely that there is a strictly mathematically sound way to keep the gradients for that variable from ever pushing it below zero. However, you can add a prior to your model, namely "variable X should stay non-negative", and increase the loss whenever this prior does not hold. One way to do this is to add the following penalty term to the loss:
K.sum(K.relu(-var))
This solution has drawbacks: some of the components might still be negative after an iteration, since the "regularizing" gradient lags behind by one iteration (this may be remedied with a stricter rule such as K.sum(K.relu(-var + 1))), and depending on the weight you give this penalty in the loss, you might end up zeroing out components of the variable.
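One way to attach such a penalty to the model's loss is via add_loss inside a custom layer; a rough sketch (the layer name NonNegPenalized and the 1e-2 penalty weight are arbitrary choices, not from the answer):

import tensorflow as tf
from tensorflow.keras import backend as K

class NonNegPenalized(tf.keras.layers.Layer):
    def build(self, input_shape):
        self.var = self.add_weight(
            name="var",
            shape=(input_shape[-1],),
            initializer="ones",
            trainable=True,
        )

    def call(self, inputs):
        # Encode the prior "var should stay non-negative": the penalty is
        # zero while var >= 0 and grows linearly as components go below zero.
        self.add_loss(1e-2 * K.sum(K.relu(-self.var)))
        return inputs * self.var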
Upvotes: 1