Reputation: 355
Let's say I have a function f
defined on interval [0,1]
, which is smooth and increases up to some point a
after which it starts decreasing. I have a grid x[i]
on this interval, e.g. with a constant step size of dx = 0.01
, and I would like to find which of those points has the highest value, by doing the smallest number of evaluations of f
in the worst-case scenario. I think I can do much better than exhaustive search by applying something inspired with gradient-like methods. Any ideas? I was thinking of something like a binary search perhaps, or parabolic methods.
This is a bisection-like method I coded:
def optimize(f, a, b, fa, fb, dx):
if b - a <= dx:
return a if fa > fb else b
else:
m1 = 0.5*(a + b)
m1 = _round(m1, a, dx)
fm1 = fa if m1 == a else f(m1)
m2 = m1 + dx
fm2 = fb if m2 == b else f(m2)
if fm2 >= fm1:
return optimize(f, m2, b, fm2, fb, dx)
else:
return optimize(f, a, m1, fa, fm1, dx)
def _round(x, a, dx, right = False):
return a + dx*(floor((x - a)/dx) + right)
The idea is: find the middle of the interval and compute m1
and m2
- the points to the right and to the left of it. If the direction there is increasing, go for the right interval and do the same, otherwise go for the left. Whenever the interval is too small, just compare the numbers on the ends. However, this algorithm still does not use the strength of the derivatives at points I computed.
Upvotes: 0
Views: 138
Reputation: 1893
I am assuming that the function evaluation is very costly.
In the special case, that your function could be approximately fitted with a polynomial, you can easily calculate the extrema in least number of function evaluations. And since you know that there is only one maximum, a polynomial of degree 2
(quadratic) might be ideal.
For example: If f(x)
can be represented by a polynomial of some known degree, say 2
, then, you can evaluate your function at any 3
points and calculate the polynomial coefficients using Newton's difference or Lagrange interpolation method.
Then its simple to solve for the maximum for this polynomial. For a degree 2
you can easily get a closed form expression for the maximum.
To get the final answer you can then search in the vicinity of the solution.
Upvotes: 0
Reputation:
Such a function is called unimodal.
Without computing the derivatives, you can work by
finding where the deltas x[i+1]-x[i] change sign, by dichotomy (the deltas are positive then negative after the maximum); this takes Log2(n) comparisons; this approach is very close to what you describe;
adapting the Golden section method to the discrete case; it takes Logφ(n) comparisons (φ~1.618).
Apparently, the Golden section is more costly, as φ<2, but actually the dichotomic search takes two function evaluations at a time, hence 2Log2(n)=Log√2(n) .
One can show that this is optimal, i.e. you can't go faster than O(Log(n)) for an arbitrary unimodal function.
If your function is very regular, the deltas will vary smoothly. You can think of the interpolation search, which tries to better predict the searched position by a linear interpolation rather than simple halving. In favorable conditions, it can reach O(Log(Log(n)) performance. I don't know of an adaptation of this principle to the Golden search.
Actually, linear interpolation on the deltas is very close to parabolic interpolation on the function values. The latter approach might be the best for you, but you need to be careful about the corner cases.
If derivatives are allowed, you can use any root solving method on the first derivative, knowing that there is an isolated zero in the given interval.
If only the first derivative is available, use regula falsi. If the second derivative is possible as well, you may consider Newton, but prefer a safe bracketing method.
I guess that the benefits of these approaches (superlinear and quadratic convergence) are made a little useless by the fact that you are working on a grid.
Upvotes: 2
Reputation: 11910
Find points where derivative(of f(x))=(df/dx)=0
While taking derivative, you could leap by k-length steps until derivate changes sign.
When derivative changes sign, take square root of k and continue reverse direction.
When again, derivative changes sign, take square root of new k again, change direction.
Example: leap by 100 elements, find sign change, leap=10 and reverse direction, next change ==> leap=3 ... then it could be fixed to 1 element per step to find exact location.
Upvotes: 0
Reputation: 7482
DISCLAIMER: Haven't test the code. Take this as an "inspiration".
Let's say you have the following 11 points
x,f(x) = (0,3),(1,7),(2,9),(3,11),(4,13),(5,14),(6,16),(7,5),(8,3)(9,1)(1,-1)
you can do something like inspired to the bisection method
a = 0 ,f(a) = 3 | b=10,f(b)=-1 | c=(0+10/2) f(5)=14
from here you can see that the increasing interval is [a,c[ and there is no need to that for the maximum because we know that in that interval the function is increasing. Maximum has to be in interval [c,b]. So at the next iteration you change the value of a s.t. a=c
a = 5 ,f(a) = 14 | b=10,f(b)=-1 | c=(5+10/2) f(6)=16
Again [a,c]
is increasing so a
is moved on the right
you can iterate the process until a=b=c
.
Here the code that implements this idea. More info here:
int main(){
#define STEP (0.01)
#define SIZE (1/STEP)
double vals[(int)SIZE];
for (int i = 0; i < SIZE; ++i) {
double x = i*STEP;
vals[i] = -(x*x*x*x - (0.6)*(x*x));
}
for (int i = 0; i < SIZE; ++i) {
printf("%f ",vals[i]);
}
printf("\n");
int a=0,b=SIZE-1,c;
double fa=vals[a],fb=vals[b] ,fc;
c=(a+b)/2;
fc = vals[c];
while( a!=b && b!=c && a!=c){
printf("%i %i %i - %f %f %f\n",a,c,b, vals[a], vals[c],vals[b]);
if(fc - vals[c-1] > 0){ //is the function increasing in [a,c]
a = c;
}else{
b=c;
}
c=(a+b)/2;
fa=vals[a];
fb=vals[b];
fc = vals[c];
}
printf("The maximum is %i=%f with %f\n", c,(c*STEP),vals[a]);
}
Upvotes: 0