Reputation: 1349
I have a 2D array whose values are monotonic in both dimensions. What is the best way to find all (x, y) where |f(x, y) - v1| < t?
Upvotes: 9
Views: 298
Reputation: 50328
Let me turn my comment into an answer:
It's possible to solve the n × n problem in at most 4n − 3 evaluations, and I conjecture that this bound is optimal, or at least close to optimal.
The algorithm is as follows:
1. Start at the top left, i.e. at x = 1, y = n.
2. Evaluate the function f(x,y). If f(x,y) < v1 + t, increment x; otherwise decrement y.
3. Repeat from step 2 until y < 1 or x > n, i.e. until you hit the bottom or right edge of the grid.
You've now found the upper edge of the region where −t < f(x,y) − v1 < t. Now do the same for the lower edge; that is, apply the algorithm above, but replace the condition in step 2 with f(x,y) < v1 − t.
Each edge takes at most 2n − 1 function evaluations, and they overlap on at least one point (x = 1, y = n), so 4n − 3 evaluations should be enough in the worst case.
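Here's a minimal Python sketch of this walk (not from the original answer; the helper names staircase_edge and tolerance_region are mine, and f is assumed to be any callable on 1-based coordinates, non-decreasing in both):

def staircase_edge(f, n, inside):
    # Walk from the top-left corner (x = 1, y = n) of an n-by-n grid.
    # `inside` is a monotone predicate on values (true for small values,
    # false for large ones).  Returns edge[x] = the largest y such that
    # inside(f(x, y)) holds, or 0 if no such y; edge[0] is unused padding.
    # Uses at most 2n - 1 evaluations of f.
    edge = [0] * (n + 1)
    x, y = 1, n
    while x <= n and y >= 1:
        if inside(f(x, y)):
            edge[x] = y      # all of column x up to row y is inside
            x += 1           # step right
        else:
            y -= 1           # step down
    return edge              # columns never reached keep edge[x] = 0

def tolerance_region(f, n, v1, t):
    # All 1-based (x, y) with |f(x, y) - v1| < t, in O(n) evaluations.
    upper = staircase_edge(f, n, lambda v: v < v1 + t)   # upper edge
    lower = staircase_edge(f, n, lambda v: v <= v1 - t)  # lower edge
    return [(x, y)
            for x in range(1, n + 1)
            for y in range(lower[x] + 1, upper[x] + 1)]

Using <= for the lower edge keeps the band strictly inside (v1 - t, v1 + t). Listing the points costs extra time proportional to the output size, but the number of evaluations of f stays at most 2(2n - 1); both walks start by evaluating f(1, n), so caching that one value recovers the 4n - 3 bound.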
While I haven't managed to prove the optimality of the algorithm given above, I can prove that any solution must require at least 2n − 2 function evaluations in the worst case. Specifically, I can exhibit a family of bimonotone functions f on an n × n grid, such that, even if the value of f at all but 2n − 2 of the grid points is known in advance, the function must still be evaluated separately at each of the remaining points in order to determine which side of the boundary those points fall on.
An example of such a family of functions is given by f(x,y) = x + y + ε(x,y), with v1 = n + 1, t = 1, and |ε(x,y)| < 1/2 for all x, y (the bound 1/2, rather than 1, guarantees that f remains monotone increasing in both x and y when restricted to the grid points). It's easy to see that:
- every point with x + y ≤ n − 1 or x + y ≥ n + 3 lies outside the tolerance region, regardless of ε;
- every point on the diagonal x + y = n + 1 lies inside the region, regardless of ε; and
- whether a point with x + y = n or x + y = n + 2 lies inside the region depends only on the sign of ε at that point.
The uncertain super- and subdiagonals each contain n − 1 points, each of which needs to be tested separately — we can freely choose the function ε(x,y) so as to make any subset of those points fall within the tolerance region. Thus, even with this extra advance knowledge, at least 2n − 2 tests are needed to exactly determine the boundaries of the tolerance region.
In fact, this argument can be trivially tweaked to raise the lower bound to 2n − 1. Going beyond that, however, seems to require more complicated arguments, even if it seems intuitively obvious that, without such extra knowledge, we should need at least close to 4n evaluations in the worst case. Still, this proof, combined with the algorithm above, at least shows that the worst-case asymptotic complexity of this problem is Θ(n).
Also, in d ≥ 3 dimensions, this problem can be trivially solved in O(n^(d−1)) function evaluations by dividing the grid into two-dimensional slices and applying the algorithm given above separately to each slice. Further, I believe that a fairly straightforward generalization of the lower bound proof above can be used to show that the worst-case complexity of the d-dimensional version of this problem is, indeed, Θ(n^(d−1)).
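As a hedged sketch of that slicing argument (reusing tolerance_region from the sketch above, with f now a callable on d 1-based coordinates):

from itertools import product

def tolerance_region_nd(f, n, d, v1, t):
    # Fix the last d - 2 coordinates, then run the 2-D walk on each of
    # the n^(d-2) slices: O(n) evaluations per slice, O(n^(d-1)) total.
    points = []
    for rest in product(range(1, n + 1), repeat=d - 2):
        # f restricted to a 2-D slice is still monotone in x and y
        slice_f = lambda x, y, rest=rest: f(x, y, *rest)
        points.extend((x, y) + rest
                      for (x, y) in tolerance_region(slice_f, n, v1, t))
    return points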
Upvotes: 2
Reputation: 65437
Searching for a value in a matrix with sorted rows and columns is a classic problem. Initialize row to its maximum value and column to its minimum value. Examine the matrix entry at (row, column). If it's less than the target value, then increase column by one. If it's greater than the target value, then decrease row by one. The number of matrix entries inspected is at most #rows + #columns - 1.
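For concreteness, a Python sketch of this classic search (my own code, assuming rows and columns are sorted ascending, so the starting corner is the bottom-left):

def saddleback_find(matrix, target):
    # Rows and columns sorted ascending.  Start at the maximum row,
    # minimum column; inspects at most #rows + #columns - 1 entries.
    row, col = len(matrix) - 1, 0
    while row >= 0 and col < len(matrix[0]):
        entry = matrix[row][col]
        if entry < target:
            col += 1        # entries above in this column are smaller still
        elif entry > target:
            row -= 1        # entries to the right in this row are larger still
        else:
            return (row, col)
    return None             # target not present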
If this search is continued after the target value is found by decreasing row by one (respectively, increasing column by one), then we obtain as a byproduct a determination of which elements are less than (respectively, less than or equal to) the target value. Perform this search algorithm 2|V| times with different target values (less than or equal to v_i - t, less than v_i + t) to solve the problem with O(|V| n) matrix elements inspected.
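A sketch of this continued search (again my own code and naming, with rows and columns ascending; the worked example below instead has its columns sorted descending, largest values at the top, so the same walk starts at the top-left corner there):

# Mooing Duck's instance with its rows flipped so that columns ascend:
matrix = [
    [30, 40, 41, 44, 49, 50],
    [30, 40, 42, 45, 50, 51],
    [30, 40, 45, 46, 51, 52],
    [30, 50, 52, 55, 70, 80],
    [40, 51, 60, 70, 80, 90],
    [50, 60, 70, 80, 90, 100],
]

def below_boundary(matrix, pred):
    # For each row, find the rightmost column whose entry satisfies
    # `pred` (-1 if none).  `pred` must be monotone: true on small
    # entries, false on large ones (e.g. lambda v: v <= 48).
    # Inspects at most #rows + #columns - 1 entries.
    n_rows, n_cols = len(matrix), len(matrix[0])
    boundary = [n_cols - 1] * n_rows   # rows never reached satisfy pred fully
    row, col = n_rows - 1, 0           # bottom-left corner, as above
    while row >= 0 and col < n_cols:
        if pred(matrix[row][col]):
            col += 1                   # this row's "true" prefix extends
        else:
            boundary[row] = col - 1    # prefix of this row ends left of col
            row -= 1                   # rows above have longer prefixes
    return boundary

# Two passes give the tolerance band 48 < v < 52 (v1 = 50, t = 2):
le_48 = below_boundary(matrix, lambda v: v <= 48)
lt_52 = below_boundary(matrix, lambda v: v < 52)
in_band = [(r, c) for r in range(len(matrix))
                  for c in range(le_48[r] + 1, lt_52[r] + 1)]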
Various optimizations are possible. The most obvious is to cache the matrix values. Another is, instead of stepping by just one, use unbounded binary search to determine the most appropriate increment/decrement. It's unclear in prospect whether the higher worst-case constant factor would be overcome by the savings resulting from large changes in neighboring entries.
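As a sketch of what the unbounded-binary-search variant might look like for the rightward steps (a hypothetical helper; bisect is Python's standard binary-search module):

from bisect import bisect_right

def first_above(row, start, bound):
    # Exponential ("galloping") search: first index i >= start with
    # row[i] > bound, probing O(log k) entries when the answer lies k
    # positions past start.  `row` is sorted ascending.
    step, lo, hi = 1, start, start
    while hi < len(row) and row[hi] <= bound:
        lo = hi + 1          # the answer lies strictly beyond hi
        hi += step
        step *= 2
    hi = min(hi, len(row))
    return bisect_right(row, bound, lo, hi)

Replacing the one-step column increment with a gallop like this trades a higher constant factor in the worst case for large skips when neighboring entries change a lot, which is exactly the trade-off questioned above.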
Example using Mooing Duck's instance:
To look for elements in the range (48, 52), look for elements less than or equal to 48 and (separately) elements less than 52.
Looking for elements less than or equal to 48: go right (>) if the current element is less than or equal to 48; go down (v) if the current element is greater than 48.
50   60   70   80   90   100
 v
40 > 51   60   70   80   90
      v
30   50   52   55   70   80
      v
30   40 > 45 > 46 > 51   52
                     v
30   40   42   45   50   51
                     v
30   40   41   44   49   50
                     v
done (access would be out of bounds)
The elements less than or equal to 48 are those in some submatrix consisting of an examined element found to be less than or equal to 48 together with the elements to its left and below it.
Upvotes: 10
Reputation: 1260
You can find all points for a single value in O(2*log n + 4*n).
First, find the interval of points that have acceptable values in the first row; you can do this with two binary searches. You will obtain c1 and c2 such that |f(0, c) - v1| < t for all c in [c1, c2]. This initial step takes O(log n).
Then, since the function is monotonically non-decreasing in both the X and Y dimensions, you know that when you increment the row, those two values c1 and c2 can only decrease. The total number of decrements is at most c1 + c2 <= 2*n, and each of the two endpoints is also tested once per row across the n rows, so we get an upper bound of 4*n function calls.
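A sketch of this approach (my own code; find_band and first_true are hypothetical names, and f is assumed to be a 0-based callable, non-decreasing in both coordinates):

def first_true(pred, lo, hi):
    # Binary search: smallest i in [lo, hi] with pred(i) true, or hi + 1
    # if pred is false everywhere; pred must be monotone (false, then true).
    while lo <= hi:
        mid = (lo + hi) // 2
        if pred(mid):
            hi = mid - 1
        else:
            lo = mid + 1
    return lo

def find_band(f, n, v1, t):
    # Two binary searches on row 0 give the initial window [c1, c2]:
    # c1 = first column with f(0, c) > v1 - t,
    # c2 = last column with f(0, c) < v1 + t.
    c1 = first_true(lambda c: f(0, c) > v1 - t, 0, n - 1)
    c2 = first_true(lambda c: f(0, c) >= v1 + t, 0, n - 1) - 1
    points = []
    for row in range(n):
        # Incrementing the row can only shift the window left, so both
        # endpoints only ever decrease: at most 2n decrements plus one
        # failed test per endpoint per row, about 4n evaluations total.
        while c1 > 0 and f(row, c1 - 1) > v1 - t:
            c1 -= 1
        while c2 >= c1 and f(row, c2) >= v1 + t:
            c2 -= 1
        points.extend((row, c) for c in range(c1, c2 + 1))
    return points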
Upvotes: 1