Reputation: 1912
I am using TensorFlow to minimize a function. The function takes about 10 parameters. Every single parameter has bounds, i.e., a minimum and a maximum value the parameter is allowed to take. For example, the parameter x1 needs to be between 1 and 10.
I also have a pair of parameters that need to satisfy the following constraint: x2 > x3. In other words, x2 must always be bigger than x3. (In addition to this, x2 and x3 also have bounds, as in the example of x1 above.)
I know that tf.Variable has a "constraint" argument, but I can't really find any examples or documentation on how to use it to achieve the bounds and constraints mentioned above.
Thank you!
Upvotes: 2
Views: 3728
Reputation: 21
As suggested by Slowpoke, Approach 1 seems to work well with some slight modifications in my case (Python 3.10, TensorFlow 2.14.0):
Define the tf.Variable with the pre-defined constraint function:
z = tf.Variable(..., constraint=lambda x: tf.clip_by_value(x, 0.0, 1.0))
z then respects the pre-defined constraint (i.e., stays between 0.0 and 1.0) after being updated by the optimizer (see the official documentation):
with tf.GradientTape() as tape:
    loss = loss_func(...)
grads = tape.gradient(loss, z)
optimizer.apply_gradients([(grads, z)])
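For completeness, here is a minimal end-to-end sketch of the above; the toy quadratic objective, learning rate, and step count are made up for illustration:
import tensorflow as tf

# Toy problem: minimize (z - 2)^2 subject to z in [0, 1].
# The unconstrained optimum is z = 2, so z should end up clipped at 1.0.
z = tf.Variable(0.5, constraint=lambda x: tf.clip_by_value(x, 0.0, 1.0))
optimizer = tf.keras.optimizers.SGD(learning_rate=0.1)

for _ in range(100):
    with tf.GradientTape() as tape:
        loss = (z - 2.0) ** 2
    grads = tape.gradient(loss, z)
    optimizer.apply_gradients([(grads, z)])  # constraint is applied after the update

print(z.numpy())  # approximately 1.0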
Upvotes: 0
Reputation: 68140
In addition to the answer by Slowpoke, reparameterization is another option. E.g., say you have a parameter p which should be bounded in [lower_bound, upper_bound]; you could write:
p_inner = tf.Variable(...) # unbounded
p = tf.sigmoid(p_inner) * (upper_bound - lower_bound) + lower_bound
However, this will change the behavior of gradient descent: the sigmoid saturates near the bounds, so the gradients with respect to p_inner shrink and updates slow down as p approaches either end of the interval.
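The same trick can also cover the x2 > x3 constraint from the question. A rough sketch, assuming TF 2.x eager mode and a hypothetical loss_func (softplus keeps the offset strictly positive; x2's own upper bound would still need separate handling):
x3_inner = tf.Variable(0.0)  # unbounded
t_inner = tf.Variable(0.0)   # unbounded

with tf.GradientTape() as tape:
    x3 = tf.sigmoid(x3_inner) * (10.0 - 1.0) + 1.0  # x3 bounded in (1, 10)
    x2 = x3 + tf.nn.softplus(t_inner)               # softplus > 0, so x2 > x3
    loss = loss_func(x2, x3)                        # hypothetical objective
grads = tape.gradient(loss, [x3_inner, t_inner])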
Upvotes: 0
Reputation: 1079
It seems to me (I can be mistaken) that constrained optimization (you can google for it in TensorFlow) is not exactly the case for which TensorFlow was designed. You may want to take a look at this repo; it may satisfy your needs, but as far as I understand, it still doesn't solve arbitrary constrained optimization, only some classification problems with labels and features, compatible with precision/recall scores.
If you want to use constraints on a TensorFlow variable (i.e., some function applied after the gradient step, which you can also do manually by taking the variable's values, manipulating them, and reassigning them), it means that you will be clipping the variables after each step taken along the gradient in the unconstrained space. It's a question whether you will successfully reach the right optimization goal this way, or whether your variables will get stuck at the boundaries, because the unconstrained gradient will point somewhere outside the feasible region.
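The manual variant mentioned above looks roughly like this in TF1-style code (a hypothetical sketch; x1, loss, sess, the learning rate, and the step count are placeholders for the question's setup):
train_step = tf.train.GradientDescentOptimizer(0.01).minimize(loss)
project_x1 = tf.assign(x1, tf.clip_by_value(x1, 1.0, 10.0))  # pull x1 back into [1, 10]

for _ in range(1000):
    sess.run(train_step)   # unconstrained gradient step
    sess.run(project_x1)   # projection back onto the feasible interval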
My approach 1
If your problem is simple enough, you can try to parametrize your x2 and x3 as x2 = x3 + t, and then do the clipping in the graph:
x3 = tf.get_variable('x3',
dtype=tf.float32,
shape=(1,),
initializer=tf.random_uniform_initializer(minval=1., maxval=10.),
constraint=lambda z: tf.clip_by_value(z, 1, 10))
t = tf.get_variable('t',
dtype=tf.float32,
shape=(1,),
initializer=tf.random_uniform_initializer(minval=1., maxval=10.),
constraint=lambda z: tf.clip_by_value(z, 1, 10))
x2 = x3 + t
Then, on a separate call, additionally clip x2 back into its bounds. Note that x2 = x3 + t is a tensor rather than a variable, so it cannot be assigned directly; clip t instead:
sess.run(tf.assign(t, tf.clip_by_value(x3 + t, 1.0, 10.0) - x3))
But my opinion is that it won't work well.
My approach 2
I would also try to invent some loss terms to keep the variables within their constraints, which is more likely to work. For example, a penalty term keeping x2 in the interval [1, 10] would be:
loss += alpha * tf.abs(tf.math.tan(((x2 - 5.5) / 4.5) * np.pi / 2))
Here the expression under the tan is mapped to (-pi/2, pi/2), and the tan function is then used to make the penalty grow very rapidly as x2 approaches the boundaries. With this approach I think you're more likely to find your optimum, but the loss weight alpha might be too big, and training will get stuck somewhere near the boundary if the required value of x2 lies close to it. In that case you can try a smaller alpha.
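A minimal sketch of this penalty approach in TF 2.x eager style (the toy objective, alpha, learning rate, and step count are made up; x2 must start inside (1, 10) for the tan barrier to be meaningful):
import numpy as np
import tensorflow as tf

alpha = 0.01
x2 = tf.Variable(5.0)  # must be initialized inside (1, 10)
optimizer = tf.keras.optimizers.SGD(learning_rate=0.05)

for _ in range(200):
    with tf.GradientTape() as tape:
        loss = (x2 - 7.0) ** 2  # toy objective with its optimum inside the bounds
        # barrier term: |tan| blows up as x2 approaches 1 or 10
        loss += alpha * tf.abs(tf.math.tan(((x2 - 5.5) / 4.5) * np.pi / 2))
    grads = tape.gradient(loss, [x2])
    optimizer.apply_gradients(zip(grads, [x2]))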
Upvotes: 5