Reputation: 3106
Given a numpy array of shape (64,64) (=an image) and an arbitrary function that takes that image as an input, I want to find the image that minimizes the function. Let's say the function computes the contrast.
Example:
import numpy as np
def contrast(X):
vmin, vmax = int(np.min(X)), int(np.max(X))
num = vmax - vmin
denom = vmax + vmin
if denom == 0:
return 0
else:
return num / denom
img = np.random.randint(256, size=(64,64), dtype=np.uint8)
res = contrast(img)
Scipy offers fmin()
, but that function would not work with such a large input. Any ideas how to find
the image that minimizes the function?
Upvotes: 2
Views: 177
Reputation: 163
It is by no means perfect, but you can at least get close to a local minimum¹ with a simple gradient descent optimization and automatic differentiation in e.g. autograd
. In order for the automatic gradient to work, you most likely will have to convert the image data to floats, do the optimization, and subsequently convert and cast back to ints. This might in principle cause you to miss minima or find wrong ones or get stuck in local ones.
1: Note that this in no way guarantees that you find a global minimum in any case where such a thing exists, this will find a minimum.
import autograd.numpy as np
from autograd import elementwise_grad
def michelson_contrast(image):
vmin, vmax = np.min(image), np.max(image)
if (vmax + vmin) > 1e-15:
return (vmax - vmin) / (vmax + vmin)
return 0
For the specific function you specified, the Michelson contrast, the optimization converges extremely slowly,
f = michelson_contrast
df = elementwise_grad(f)
img = np.random.randint(256, size=(100, 100)).astype(np.float64)
# Simple gradient descent.
for i in range(1, (max_iterations := 100000) + 1):
img -= 10**3 * df(img)
# Round and cast the image back to integer values.
img = np.round(img).astype(int)
but a 100 x 100
random test converges on my laptop in about a minute.
iter. function
--------------------------------------
0 1.0000000000
10000 0.6198908490
20000 0.4906918649
30000 0.3968742592
40000 0.3204002330
50000 0.2539835041
60000 0.1942016682
70000 0.1386916909
80000 0.0863448569
90000 0.0361678029
100000 0.0003124169
Rounded back to integers, the answer is an exact minimum with f = 0
, but there of course exists many (256 of them to be exact):
[[146 146 146 ... 146 146 146]
[146 146 146 ... 146 146 146]
[146 146 146 ... 146 146 146]
...
[146 146 146 ... 146 146 146]
[146 146 146 ... 146 146 146]
[146 146 146 ... 146 146 146]]
A different example, the RMS contrast, converges much faster (less than a second)
def rms_contrast(image):
N = image.size
image_mean = np.mean(image)
return np.sum((image - image_mean)**2) / N
f = rms_contrast
df = elementwise_grad(f)
img = np.random.randint(256, size=(100, 100)).astype(np.float64)
for i in range(1, (max_iterations := 100) + 1):
img -= 10**3 * df(img)
img = np.round(img).astype(int)
with
iter. function
--------------------------------------
0 5486.3646543900
10 63.2534779216
20 0.7292629494
30 0.0084078294
40 0.0000969357
50 0.0000011176
60 0.0000000129
70 0.0000000001
80 0.0000000000
90 0.0000000000
100 0.0000000000
and resulting image (again a perfect minimum after casting back to integers).
[[126 126 126 ... 126 126 126]
[126 126 126 ... 126 126 126]
[126 126 126 ... 126 126 126]
...
[126 126 126 ... 126 126 126]
[126 126 126 ... 126 126 126]
[126 126 126 ... 126 126 126]]
Unless the function is very complicated or computationally expensive, or the input image is enormous, this approach should at least get you somewhat closer to your answer.
Upvotes: 1