José Mora
José Mora

Reputation: 23

Nested loops in R, all the possibilities

I need to make a data frame that has all possibilities in which seven variables sum up 100, every variable could be 0 to 100. I implement the following code but it takes to long.

combina <- function(U){
          d<- NULL
          for (i in 0:U) {
            for (j in 0:U) {
              for (k in 0:U) {
                for (l in 0:U) {
                  for (m in 0:U) {
                    for (n in 0:U) {
                      for (o in 0:U) {
                        if (i+j+k+l+m+n+o == U){
                          d <- rbind(d,c(i,j,k,l,m,n,o))
                        }
                      }
                    }
                  }
                }
              }
            }
          }
        return(d)
        }

as you see, i used the U variable for tests and after the 15 it takes forever...

Upvotes: 1

Views: 1770

Answers (2)

Gopala
Gopala

Reputation: 10483

This is one way I can think of solving your problem (on a much smaller scale since I can't want to create 100^7 data). :)

For example, if you have four variables ranging from 0 - 10 and you want to find all combinations of them that add up to 10, here is how you can do:

df <- expand.grid(a = 1:10, b = 1:10, c = 1:10, d = 1:10)
df[rowSums(df) == 10, ]

The output has 84 combinations of values out of the 10,000 (10^4) passible values.

Of course, this solution looks simple, because it creates and stores the 10,000 x 4 data frame first. This gets to be a big problem as size increases.

  • Edited per comment below. Agree, it is much better, and perhaps also more efficient than apply.

Upvotes: 3

Dave Radcliffe
Dave Radcliffe

Reputation: 664

Here is an R function that yields all sequences of k non-negative integers summing to n.

sum_to_N <- function (n, k) {
  combos <- combn(seq(0, n+k-2), k-1)
  as.data.frame(t(rbind(combos, as.integer(n+k-1)) - rbind(0L, combos+1L)))
}

Let's look at an example with smaller numbers. To find all solutions of A + B + C + D = 10 in non-negative integers, we start by finding all combinations of distinct numbers a < b < c between 0 and 12, then we set A = a, B = b - (a+1), C = c - (b+1), and D = 13 - (c+1). In general, to find all ways to write n as a sum of k integers, we start by finding all combinations of (k-1) distinct integers between 0 and n+k-2.

The number of solutions is (n+k-1) choose (k-1). In this case, n = 100 and k = 7, so the number of solutions is (106 choose 6) = 1,705,904,746. This is probably too many rows to fit in a data frame, so you should try to find an approach to your problem that does not involve storing all of the combinations.

Upvotes: 3

Related Questions