Simd

Reputation: 21263

An efficient algorithm to count the number of integer grids

Consider a square 3 by 3 grid of non-negative integers. For each row i the sum of the integers is set to be r_i. Similarly for each column j the sum of integers in that column is set to be c_j. An instance of the problem is therefore described by 6 non-negative integers.

Is there an efficient algorithm to count how many different assignments of integers to the grid there are given the row and column sum constraints?

Clearly one could enumerate all possible matrices of non-negative integers with values up to sum r_i and check the constraints for each, but that would be insanely slow.

Example

Say the row constraints are 1 2 3 and the column constraints are 3 2 1. The possible integer grids are:

┌─────┬─────┬─────┬─────┬─────┬─────┬─────┬─────┬─────┬─────┬─────┬─────┐
│0 0 1│0 0 1│0 0 1│0 1 0│0 1 0│0 1 0│0 1 0│1 0 0│1 0 0│1 0 0│1 0 0│1 0 0│
│0 2 0│1 1 0│2 0 0│0 1 1│1 0 1│1 1 0│2 0 0│0 1 1│0 2 0│1 0 1│1 1 0│2 0 0│
│3 0 0│2 1 0│1 2 0│3 0 0│2 1 0│2 0 1│1 1 1│2 1 0│2 0 1│1 2 0│1 1 1│0 2 1│
└─────┴─────┴─────┴─────┴─────┴─────┴─────┴─────┴─────┴─────┴─────┴─────┘
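As a baseline for checking faster methods, the slow enumeration described above can be sketched in Python. This is only an illustration: the function name `count_brute_force` is made up here, and each cell is capped at the smaller of its row sum and column sum to keep the search space small.

```python
from itertools import product

def count_brute_force(row_sums, col_sums):
    """Count non-negative integer grids with the given row and column sums
    by trying every combination of cell values (exponentially slow)."""
    n_rows, n_cols = len(row_sums), len(col_sums)
    # A cell can never exceed its own row sum or its own column sum.
    ranges = [range(min(row_sums[i], col_sums[j]) + 1)
              for i in range(n_rows) for j in range(n_cols)]
    count = 0
    for cells in product(*ranges):
        # Cut the flat tuple of cells back into rows.
        grid = [cells[i * n_cols:(i + 1) * n_cols] for i in range(n_rows)]
        rows_ok = all(sum(row) == r for row, r in zip(grid, row_sums))
        cols_ok = all(sum(col) == c for col, c in zip(zip(*grid), col_sums))
        if rows_ok and cols_ok:
            count += 1
    return count

print(count_brute_force([1, 2, 3], [3, 2, 1]))  # 12, matching the table above
```

This is only practical for very small sums, which is exactly why a better algorithm is wanted.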

In practice my main interest is when the total sum of the grid will be at most 100 but a more general solution would be very interesting.

Upvotes: 15

Views: 735

Answers (6)

hivert

Reputation: 10667

Besides my other answer, which uses the Robinson–Schensted–Knuth bijection, here is another solution that needs no advanced combinatorics, only a programming trick, and that works for arbitrarily larger matrices. The first idea for this kind of problem is recursion, avoiding recomputation through memoization or, better, dynamic programming. Specifically, once you have chosen a candidate for the first row, you subtract it from the column sums and are left with the same problem with one fewer row. To avoid recomputing things, you store the results. You can do this

  • either basically in a big table (memoization)

  • or in a more tricky way by storing all the solutions for matrices with n rows and deducing the number of solutions for matrices with n+1 rows (dynamic programming).

Here is a recursive method using memoization in Python:

 # Generator for the rows with sum s whose entries are bounded entrywise by maxrow
 def choose_one_row(s, maxrow):
     if not maxrow:
         if s == 0: yield []
         else: return
     else:
         for i in range(0, maxrow[0]+1):
             for res in choose_one_row(s-i, maxrow[1:]):
                 yield [i]+res


 memo = dict()
 def nmat(rsum, colsum):
     # sanity check: sum by row and column must match
     if sum(rsum) != sum(colsum): return 0
     # base case rsum is empty
     if not rsum: return 1
     # convert to immutable tuple for memoization
     rsum = tuple(rsum)
     colsum = tuple(colsum)
     # return the cached result if it was already computed
     try:
         return memo[rsum, colsum]
     except KeyError:
         pass
     # apply the recursive formula
     res = 0
     for row in choose_one_row(rsum[0], colsum):
         res += nmat(rsum[1:], tuple(a - b for a, b in zip(colsum, row)))
     # memoize the result
     memo[(tuple(rsum), tuple(colsum))] = res
     return res

Then after that:

sage: nmat([3,2,1], [3,2,1])
12

sage: %time nmat([6,5,4,3,2,1], [6,5,4,3,2,1])
CPU times: user 1.49 s, sys: 7.16 ms, total: 1.5 s
Wall time: 1.48 s
8264346
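The dynamic-programming variant mentioned in the second bullet can be sketched as follows. This is an assumption-laden illustration rather than the code used for the timings above: `bounded_rows` and `count_dp` are names invented here, and the table maps each vector of remaining column sums to the number of ways to reach it.

```python
from collections import defaultdict

def bounded_rows(total, bounds):
    """Yield tuples of non-negative integers with the given sum,
    where entry j is at most bounds[j]."""
    if not bounds:
        if total == 0:
            yield ()
        return
    for value in range(min(total, bounds[0]) + 1):
        for rest in bounded_rows(total - value, bounds[1:]):
            yield (value,) + rest

def count_dp(row_sums, col_sums):
    """Count grids row by row, keeping one table per number of rows placed."""
    if sum(row_sums) != sum(col_sums):
        return 0
    # State: remaining column sums -> number of ways to get there.
    states = {tuple(col_sums): 1}
    for r in row_sums:
        next_states = defaultdict(int)
        for remaining, ways in states.items():
            for row in bounded_rows(r, remaining):
                key = tuple(a - b for a, b in zip(remaining, row))
                next_states[key] += ways
        states = next_states
    # Only the all-zero remainder corresponds to a complete grid.
    return states.get(tuple(0 for _ in col_sums), 0)

print(count_dp([3, 2, 1], [3, 2, 1]))  # 12
```

Compared with the memoized recursion, only the table for the current number of rows is kept in memory at any time.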

Upvotes: 1

hivert

Reputation: 10667

This won't change the fact that the problem is #P-hard (if you allow matrices of any size; see the reference in the comment below), but there is a solution that does not enumerate all the matrices. Instead it enumerates a smaller set of objects called semi-standard Young tableaux. Depending on your input it could be faster, although the complexity is still exponential. Since this fills an entire chapter in several algebraic combinatorics books and in Knuth's TAOCP, volume 3, I won't go into details here and will only point to the relevant Wikipedia pages.

The idea is that, under the Robinson–Schensted–Knuth correspondence, each of these matrices is in bijection with a pair of tableaux of the same shape, where one tableau is filled with integers whose multiplicities are given by the row sums and the other with integers whose multiplicities are given by the column sums. The number of tableaux of shape U filled with numbers counted by V is called the Kostka number K(U,V). As a consequence, you end up with a formula such as

#Mat(RowSum, ColSum) = \sum_shape  K(shape, RowSum)*K(shape, ColSum) 

Of course if RowSum == ColSum == Sum:

#Mat(Sum, Sum) = \sum_shape  K(shape, Sum)^2 

Here is your example in the SageMath system:

sage: sum(SemistandardTableaux(p, [3,2,1]).cardinality()^2 for p in  Partitions(6))
12

Here are some larger examples:

sage: sums = [6,5,4,3,2,1]
sage: %time sum(SemistandardTableaux(p, sums).cardinality()^2 for p in Partitions(sum(sums)))
CPU times: user 228 ms, sys: 4.77 ms, total: 233 ms
Wall time: 224 ms
8264346

sage: sums = [7,6,5,4,3,2,1]
sage: %time sum(SemistandardTableaux(p, sums).cardinality()^2 for p in Partitions(sum(sums)))
CPU times: user 1.95 s, sys: 205 µs, total: 1.95 s
Wall time: 1.94 s
13150070522

sage: sums = [5,4,4,4,4,3,2,1]
sage: %time sum(SemistandardTableaux(p, sums).cardinality()^2 for p in Partitions(sum(sums)))
CPU times: user 1.62 s, sys: 221 µs, total: 1.62 s
Wall time: 1.61 s
1769107201498

Clearly, enumerating the matrices themselves would not get you results that fast.

As requested by @גלעד ברקן, here is a solution with different row and column sums:

sage: rsums = [5,4,3,2,1]; colsums = [5,4,3,3]
sage: %time sum(SemistandardTableaux(p, rsums).cardinality() * SemistandardTableaux(p, colsums).cardinality() for p in Partitions(sum(rsums)))
CPU times: user 88.3 ms, sys: 8.04 ms, total: 96.3 ms
Wall time: 92.4 ms
10233
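For readers without Sage, the Kostka-number formula can be sketched in plain Python. This is a rough illustration and not how Sage computes `SemistandardTableaux(...).cardinality()`: the helper names are invented here, and K(shape, content) is computed with the standard recursion that peels off the cells holding the largest entry, which always form a horizontal strip.

```python
from functools import lru_cache

def partitions(n, max_part=None):
    """Yield the partitions of n as weakly decreasing tuples."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, max_part), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest

def remove_horizontal_strip(shape, k):
    """Yield the shapes obtained from `shape` by removing a horizontal
    strip of k cells (at most one cell per column)."""
    def rec(i, remaining):
        if i == len(shape):
            if remaining == 0:
                yield ()
            return
        # Row i may shrink down to the length of the row below it.
        lower = shape[i + 1] if i + 1 < len(shape) else 0
        for new_len in range(shape[i], lower - 1, -1):
            removed = shape[i] - new_len
            if removed > remaining:
                break
            for rest in rec(i + 1, remaining - removed):
                yield (new_len,) + rest
    for nu in rec(0, k):
        yield tuple(part for part in nu if part > 0)

@lru_cache(maxsize=None)
def kostka(shape, content):
    """Number of semi-standard tableaux of the given shape and content."""
    if not content:
        return 1 if not shape else 0
    if sum(shape) != sum(content):
        return 0
    return sum(kostka(nu, content[:-1])
               for nu in remove_horizontal_strip(shape, content[-1]))

def count_matrices(row_sums, col_sums):
    """Sum K(shape, row_sums) * K(shape, col_sums) over all shapes."""
    total = sum(row_sums)
    if total != sum(col_sums):
        return 0
    rows, cols = tuple(row_sums), tuple(col_sums)
    return sum(kostka(p, rows) * kostka(p, cols) for p in partitions(total))

print(count_matrices([3, 2, 1], [3, 2, 1]))  # 12
```

The number of shapes grows with the number of partitions of the total sum, so this sketch, like the Sage one-liner, is still exponential in general.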

Upvotes: 3

Koray

Reputation: 1796

I've tried to optimize the slow option. I started from code that generates all the combinations and changed it to return only the total count. This is the fastest I could get:

    private static int count(int[] rowSums, int[] colSums)
    {
        int count = 0;
        int[] row0 = new int[3];
        int sum = rowSums[0];
        for (int r0 = 0; r0 <= sum; r0++)
            for (int r1 = 0, max1 = sum - r0; r1 <= max1; r1++)
            {
                row0[0] = r0;
                row0[1] = r1;
                row0[2] = sum - r0 - r1;
                count += getCombinations(rowSums[1], row0, colSums);
            }                    
        return count;
    }
    private static int getCombinations(int sum, int[] row0, int[] colSums)
    {
        int count = 0;
        int max1 = Math.Min(colSums[1] - row0[1], sum);
        int max2 = Math.Min(colSums[2] - row0[2], sum);
        for (int r0 = 0, max0 = Math.Min(colSums[0] - row0[0], sum); r0 <= max0; r0++)
            for (int r1 = 0; r1 <= max1; r1++)
            {
                int r01 = r0 + r1;
                if (r01 <= sum)
                    if ((r01 + max2) >= sum)
                        count++;
            }
        return count;
    }




Stopwatch w2 = Stopwatch.StartNew();
int res = count(new int[] { 1, 2, 3 }, new int[] { 3, 2, 1 });//12
int res1 = count(new int[] { 22, 33, 44 }, new int[] { 30, 40, 29 });//117276
int res2 = count(new int[] { 98, 99, 100}, new int[] { 100, 99, 98});//12743775
int res3 = count(new int[] { 198, 199, 200 }, new int[] { 200, 199, 198 });//201975050
w2.Stop();
Console.WriteLine("w2:" + w2.ElapsedMilliseconds);//322 - 370 on my computer

Upvotes: 2

pkuderov

Reputation: 3571

Is there an efficient algorithm to count how many different assignments of integers to the grid there are given the row and column sum constraints?

Update: my answer is wrong for this particular problem, where N is fixed (i.e. the constant 3). In that case the problem is polynomial. Sorry for the misleading information.

TL;DR: I think it's at least NP-hard, so I don't expect a polynomial algorithm, but maybe there are some heuristic speedups.


For N-by-N grid you have N equations for row sums, N equations for col sums and N^2 non-negative constraints :

sum_{j=1..N} x_ij = r_i    for i = 1..N
sum_{i=1..N} x_ij = c_j    for j = 1..N
x_ij >= 0                  for all i, j

For N > 2 this system has more than one possible solution in general, because there are N^2 unknown variables x_ij and just 2N equations, and N^2 > 2N for N > 2.

You can eliminate 2N - 1 variables, leaving just one equation in K = N^2 - (2N-1) variables that must add up to a sum S. You then have to deal with the integer partition problem to find all possible combinations of K terms that give S. This problem is NP-complete. The number of combinations depends not only on the number of terms K but also on the magnitude of S.

This problem reminded me of the simplex method. My first thought was to find one solution with something like that method and then traverse the edges of the convex polytope to find all the other solutions, hoping there would be an optimal algorithm for that. But no: the integer simplex method, which is related to integer linear programming, is NP-hard :(

I hope there are some heuristics for related problems that you can use to speed up the naive brute-force solution.

Upvotes: 5

Maybe a simple 4-nested-loop solution is fast enough, if the total sum is small?

function solve(rowsum, colsum) {
    var count = 0;
    for (var a = 0; a <= rowsum[0] && a <= colsum[0]; a++) {
        for (var b = 0; b <= rowsum[0] - a && b <= colsum[1]; b++) {
            var c = rowsum[0] - a - b;
            for (var d = 0; d <= rowsum[1] && d <= colsum[0] - a; d++) {
                var g = colsum[0] - a - d;
                for (var e = 0; e <= rowsum[1] - d && e <= colsum[1] - b; e++) {
                    var f = rowsum[1] - d - e;
                    var h = colsum[1] - b - e;
                    var i = rowsum[2] - g - h;
                    if (i >= 0 && i == colsum[2] - c - f) ++count;
                }
            }
        }
    }
    return count;
}
document.write(solve([1,2,3],[3,2,1]) + "<br>");
document.write(solve([22,33,44],[30,40,29]) + "<br>");

Upvotes: 4

phatfingers

Reputation: 10250

I don't know of a matching algorithm, but I don't think it would be that difficult to work one out. Given any one solution, you can derive another solution by selecting four corners of a rectangular region of your grid, increasing two diagonal corners by some value and decreasing the other two by that same value. The range for that value will be constrained by the lowest value of each diagonal pair. If you determine the size of all such ranges, you should be able to multiply them together to determine the total possible solutions.

Assuming you described your grid like a familiar spreadsheet alphabetically for columns, and numerically for rows, you could describe all possible regions in the following list:

A1:B2, A1:B3, A1:C2, A1:C3, B1:C2, B1:C3, A2:B3, A2:C3, B2:C3

For each region, we tabulate a range based on the lowest value from each diagonal corner pair. You can incrementally reduce either pair until a member reaches zero because there's no upper bound for the other pair.

Selecting the first solution of your example, we can derive all other possible solutions using this technique.

   A B C
  ┌─────┐
1 │0 0 1│ sum=1
2 │0 2 0│ sum=2
3 │3 0 0│ sum=3
  └─────┘
   3 2 1 = sums

A1:B2   1 solution   (0,0,0,2)
A1:C2   1 solution   (0,1,0,0)
A1:B3   1 solution   (0,0,3,0)
A1:C3   2 solutions  (0,1,3,0), (1,0,2,1)
B1:C2   2 solutions  (0,1,2,0), (1,0,1,1)
B1:C3   1 solution   (0,1,0,0)
A2:B3   3 solutions  (0,2,3,0), (1,1,2,1), (2,0,1,2)
A2:C3   1 solution   (0,0,3,0)
B2:C3   1 solution   (2,0,0,0)

Multiply all solution counts together and you get 2*2*3=12 solutions.
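The corner move itself can be sketched in Python. This illustrates the move only, not the multiplication rule: the names `neighbours` and `reachable` are invented here, and the sketch applies the move one unit at a time, collecting every grid reachable from a starting solution. Each move leaves all row and column sums unchanged.

```python
from itertools import combinations

def neighbours(grid):
    """Yield every grid obtained by one unit corner move: pick a rectangle,
    add 1 to one diagonal pair of corners and subtract 1 from the other."""
    n_rows, n_cols = len(grid), len(grid[0])
    for r1, r2 in combinations(range(n_rows), 2):
        for c1, c2 in combinations(range(n_cols), 2):
            for delta in (1, -1):
                new = [list(row) for row in grid]
                new[r1][c1] += delta
                new[r2][c2] += delta
                new[r1][c2] -= delta
                new[r2][c1] -= delta
                corners = (new[r1][c1], new[r2][c2], new[r1][c2], new[r2][c1])
                # Discard moves that would make a cell negative.
                if min(corners) >= 0:
                    yield tuple(map(tuple, new))

def reachable(start):
    """Return the set of grids reachable from `start` by repeated moves."""
    start = tuple(map(tuple, start))
    seen = {start}
    stack = [start]
    while stack:
        grid = stack.pop()
        for nb in neighbours(grid):
            if nb not in seen:
                seen.add(nb)
                stack.append(nb)
    return seen

first_solution = [[0, 0, 1], [0, 2, 0], [3, 0, 0]]
print(len(reachable(first_solution)))  # 12
```

Starting from the first solution of the example, the search visits all 12 grids, including the (1,1,2,1) variant of region A2:B3 listed above.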

Upvotes: 4
