Marcin Gadomski
Marcin Gadomski

Reputation: 35

The greatest sum in part of matrix array

I am stuck at a simple problem, I am looking for a better solution than my.

I have an integers matrix array (tab[N][M]) and integer (k) and I have to find the smallest rectangle (sub matrix array) that has sum of it's elements greater then k

So, my current attempt of a solution is:

  1. Make additional matrix array (sum[N][M]) and integer solution = infinity
  2. For each 1 < i <= N + 1 and 1 < j <= M + 1

    sum[ i ][ j ] = sum[ i - 1 ][ j ] + sum [ i ][ j - 1] + tab[ i  ] [ j ] - sum[ i - 1] [ j - 1]  
    
  3. Then look on each rectangle f.e rectangle that starts at (x, y) and ends (a, b)

    Rectangle_(x,y)_(a,b) = sum[ a ][ b ] - sum[ a - x ] [ b ] - sum[ a ][ b - y ] + sum[ a - x ][ b - y ]
    

    and if Rectangle_(x,y)_(a,b) >= k then solution = minimum of current_solution and (a - x) * (b - y)

But this solution is quite slow (quartic time), is there any possibility to make it faster? I am looking for iterated logarithmic time (or worse/better). I managed to reduce my time , but not substantially.

Upvotes: 2

Views: 207

Answers (2)

samgak
samgak

Reputation: 24417

If the matrix only contains values >= 0, then there is a linear time solution in the 1D case that can be extended to a cubic time solution in the 2D case.

For the 1D case, you do a single pass from left to right, sliding a window across the array, stretching or shrinking it as you go so that the numbers contained in the interval always sum to at least k (or breaking out of the loop if this is not possible).

Initially, set the left index bound of the interval to the first element, and the right index bound to -1, then in a loop:

  • Increment the right bound by 1, and then keep incrementing it until either the values inside the interval sum to > k, or end of the array is reached.
  • Increment the left bound to shrink the interval as small as possible without letting the values sum to less than or equal to k.
  • If the result is a valid interval (meaning the first step did not reach the end of the array without finding a valid interval) then compare it to the smallest so far and update if necessary.

This doesn't work if negative values are allowed, because in the second step you need to be able to assume that shrinking the interval always leads to a smaller sum, so when the sum dips below k you know that's the smallest possible for a given interval endpoint.

For the 2D case, you can iterate over all possible sub-matrix heights, and over each possible starting row for a given height, and perform this horizontal sweep for each row.

In pseudo-code:

Assume you have a function rectangle_sum(x, y, a, b) that returns the sum of the values from (x, y) to (a, b) inclusive and runs in O(1) time used a summed area table.

for(height = 1; height <= M; height++)  // iterate over submatrix heights
{
    for(row = 0; row <= (M-h); row++)   // iterate over all rows
    {
        start = 0; end = -1; // initialize interval
        while(end < N)  // iterate across the row
        {
            valid_interval = false;
            // increment end until the interval sums to > k:
            while(end < (N-1))
            {
                end = end + 1;
                if(rectangle_sum(start, row, end, row + height) > k)
                {
                    valid_interval = true;
                    break;
                }
            }
            if(!valid_interval)
                break;
            // shrink interval by incrementing start:
            while((start < end) && 
                rectangle_sum(start+1, row, end, row + height) > k))
                start = start + 1;

            compare (start, row), (end, row + height) with current smallest
            submatrix and make it the new current if it is smaller
        }
    }
}

Upvotes: 0

mcdowella
mcdowella

Reputation: 19601

I have seen a number of answers to matrix rectangle problems here which worked by solving a similar 1-dimensional problem and then applying this to every row of the matrix, every row formed by taking the sum of two adjacent rows, every sum from three adjacent rows, and so on. So here's an attempt at finding the smallest interval in a line which has at least a given sum. (Clearly, if your matrix is tall and thin instead of short and fat you would work with columns instead of rows)

Work from left to right, maintaining the sums of all prefixes of the values seen so far, up to the current position. The value of an interval ending in a position is the sum up to and including that position, minus the sum of a prefix which ends just before the interval starts. So if you keep a list of the prefix sums up to just before the current position you can find, at each point, the shortest interval ending at that point which passes your threshold. I'll explain how to search for this efficiently in the next paragraph.

In fact, you probably don't need a list of all prefix sums. Smaller prefix sums are more valuable, and prefix sums which end further along are more valuable. So any prefix sum which ends before another prefix sum and is also larger than that other prefix sum is pointless. So the prefix sums you want can be arranged into a list which retains the order in which they were calculated but also has the property that each prefix sum is smaller than the prefix sum to the right of it. This means that when you want to find the closest prefix sum which is at most a given value you can do this by binary search. It also means that when you calculate a new prefix sum you can put it into its place in the list by just discarding all prefix sums at the right hand end of the list which are larger than it, or equal to it.

Upvotes: 0

Related Questions