r0f1
r0f1

Reputation: 3106

Find Minimum Of Function That Takes Matrix As Input

Given a numpy array of shape (64,64) (=an image) and an arbitrary function that takes that image as an input, I want to find the image that minimizes the function. Let's say the function computes the contrast.

Example:

import numpy as np

def contrast(X):
    vmin, vmax = int(np.min(X)), int(np.max(X))
    num = vmax - vmin
    denom = vmax + vmin
    if denom == 0:
        return 0
    else:
        return num / denom

img = np.random.randint(256, size=(64,64), dtype=np.uint8)
res = contrast(img)

Scipy offers fmin(), but that function would not work with such a large input. Any ideas how to find the image that minimizes the function?

Upvotes: 2

Views: 177

Answers (1)

Morten
Morten

Reputation: 163

Run the code in google colab.

It is by no means perfect, but you can at least get close to a local minimum¹ with a simple gradient descent optimization and automatic differentiation in e.g. autograd. In order for the automatic gradient to work, you most likely will have to convert the image data to floats, do the optimization, and subsequently convert and cast back to ints. This might in principle cause you to miss minima or find wrong ones or get stuck in local ones.

1: Note that this in no way guarantees that you find a global minimum in any case where such a thing exists, this will find a minimum.

import autograd.numpy as np
from autograd import elementwise_grad

def michelson_contrast(image):
    vmin, vmax = np.min(image), np.max(image)
    if (vmax + vmin) > 1e-15:
        return (vmax - vmin) / (vmax + vmin)
    return 0

For the specific function you specified, the Michelson contrast, the optimization converges extremely slowly,

f = michelson_contrast
df = elementwise_grad(f)
img = np.random.randint(256, size=(100, 100)).astype(np.float64)

# Simple gradient descent.
for i in range(1, (max_iterations := 100000) + 1):
    img -= 10**3 * df(img)

# Round and cast the image back to integer values.
img = np.round(img).astype(int)

but a 100 x 100 random test converges on my laptop in about a minute.

           iter.             function
--------------------------------------
               0         1.0000000000
           10000         0.6198908490
           20000         0.4906918649
           30000         0.3968742592
           40000         0.3204002330
           50000         0.2539835041
           60000         0.1942016682
           70000         0.1386916909
           80000         0.0863448569
           90000         0.0361678029
          100000         0.0003124169

Rounded back to integers, the answer is an exact minimum with f = 0, but there of course exists many (256 of them to be exact):

[[146 146 146 ... 146 146 146]
 [146 146 146 ... 146 146 146]
 [146 146 146 ... 146 146 146]
 ...
 [146 146 146 ... 146 146 146]
 [146 146 146 ... 146 146 146]
 [146 146 146 ... 146 146 146]]

A different example, the RMS contrast, converges much faster (less than a second)

def rms_contrast(image):
    N = image.size
    image_mean = np.mean(image)
    return np.sum((image - image_mean)**2) / N

f = rms_contrast
df = elementwise_grad(f)
img = np.random.randint(256, size=(100, 100)).astype(np.float64)

for i in range(1, (max_iterations := 100) + 1):
    img -= 10**3 * df(img)

img = np.round(img).astype(int)

with

           iter.             function
--------------------------------------
               0      5486.3646543900
              10        63.2534779216
              20         0.7292629494
              30         0.0084078294
              40         0.0000969357
              50         0.0000011176
              60         0.0000000129
              70         0.0000000001
              80         0.0000000000
              90         0.0000000000
             100         0.0000000000

and resulting image (again a perfect minimum after casting back to integers).

[[126 126 126 ... 126 126 126]
 [126 126 126 ... 126 126 126]
 [126 126 126 ... 126 126 126]
 ...
 [126 126 126 ... 126 126 126]
 [126 126 126 ... 126 126 126]
 [126 126 126 ... 126 126 126]]

Unless the function is very complicated or computationally expensive, or the input image is enormous, this approach should at least get you somewhat closer to your answer.

Upvotes: 1

Related Questions