Fabian Stolz
Fabian Stolz

Reputation: 2095

Filling a 3D Matrix by for loops with values in R

I set up a 3 dimensinoal matrix of size 365x7x4.

x <- array(rep(1, 365*5*4), dim=c(365, 5, 4))

Now I would to use a for loop to fill each element with a value. Lets say the value of each element should be sum of row, column and depth. I guess this is relatively easy.

Thanks! best, F

Upvotes: 2

Views: 4514

Answers (2)

Gavin Simpson
Gavin Simpson

Reputation: 174948

Using a simpler example so we can see what is being done

arr <- array(seq_len(3*3*3), dim = rep(3,3,3))

the following code gives the requested output:

dims <- dim(arr)
ind <- expand.grid(lapply(dims, seq_len))
arr[] <- rowSums(ind)

The above gives

> arr
, , 1

     [,1] [,2] [,3]
[1,]    3    4    5
[2,]    4    5    6
[3,]    5    6    7

, , 2

     [,1] [,2] [,3]
[1,]    4    5    6
[2,]    5    6    7
[3,]    6    7    8

, , 3

     [,1] [,2] [,3]
[1,]    5    6    7
[2,]    6    7    8
[3,]    7    8    9

> arr[1,1,1]
[1] 3
> arr[1,2,3]
[1] 6
> arr[3,3,3]
[1] 9

Update: Using the example in @TimP's Answer here I update the Answer to show how it can be done in a more R-like fashion.

Given

arr <- array(seq_len(3*3*3), dim = rep(3,3,3))

Replace elements of arr with i + j + k unless k > 2, in which case j*k-i is used instead.

dims <- dim(arr)
ind <- expand.grid(lapply(dims, seq_len))
## which k > 2
want <- ind[,3] > 2
arr[!want] <- rowSums(ind[!want, ])
arr[want] <- ind[want, 2] * ind[want, 3] - ind[want, 1]

Whilst it is tempting to stick with familiar idioms like looping, and contrary to popular belief loops are not inefficient in R, learning to think in a vectorised way will pay off many times over as you learn the language and start applying it to data analysis task.

Here are some timings on Fabian's example:

> x <- array(rep(1, 365*5*4), dim=c(365, 5, 4))
> system.time({
+ for (i in seq_len(dim(x)[1])) {
+     for (j in seq_len(dim(x)[2])) {
+         for (k in seq_len(dim(x)[3])) {
+             val = i+j+k
+             if (k > 2) {
+                 val = j*k-i
+             }
+             x[i,j,k] = val
+         }
+     }
+ }
+ })
   user  system elapsed 
  0.043   0.000   0.044 
> arr <- array(rep(1, 365*5*4), dim=c(365, 5, 4))
> system.time({
+ dims <- dim(arr)
+ ind <- expand.grid(lapply(dims, seq_len))
+ ## which k > 2
+ want <- ind[,3] > 2
+ arr[!want] <- rowSums(ind[!want, ])
+ arr[want] <- ind[want, 2] * ind[want, 3] - ind[want, 1]
+ })
   user  system elapsed 
  0.005   0.000   0.006

and for a much larger (for my ickle laptop at least!) problem

> x <- array(rep(1, 200*200*200), dim=c(200, 200, 200))
> system.time({
+ for (i in seq_len(dim(x)[1])) {
+     for (j in seq_len(dim(x)[2])) {
+         for (k in seq_len(dim(x)[3])) {
+             val = i+j+k
+             if (k > 2) {
+                 val = j*k-i
+             }
+             x[i,j,k] = val
+         }
+     }
+ }
+ })
   user  system elapsed 
 51.759   0.129  53.090
> arr <- array(rep(1, 200*200*200), dim=c(200, 200, 200))
> system.time({
+     dims <- dim(arr)
+     ind <- expand.grid(lapply(dims, seq_len))
+     ## which k > 2
+     want <- ind[,3] > 2
+     arr[!want] <- rowSums(ind[!want, ])
+     arr[want] <- ind[want, 2] * ind[want, 3] - ind[want, 1]
+ })
   user  system elapsed 
  2.282   1.036   3.397 

but even that may be modest to small by today's standards. You can see that the looping starts to become ever more uncompetitive because of the all the function calls required by that method.

Upvotes: 7

Tim P
Tim P

Reputation: 1383

Fabian: from the phrasing of your question, I believe you're just looking for a simple way of setting values in the array to follow any set of rules you might devise. No problem.

Your array is small (and from the context I strongly suspect you only want to use the code for something of that size). So good practice is simply to use a set of three for loops, which will run almost instantly - no need for any unnecessary complications. My code below shows an example: here we set element x[i,j,k] to be i+j+k, unless k>2, in which case we set it to be j*k-i instead.

Obviously, you can have as many rules as you want - just add an if statement for each one, and define val to be the value you want x[i,j,k] to take if that condition is true. (There's a few different ways to set this up, but this one seems the simplest to understand.) At the end of the innermost loop, x[i,j,k] gets set to the required value (val), and we then go on and do the next element until they're all done. That's it!

x = array(rep(1, 365*5*4), dim=c(365, 5, 4))

for (i in seq_len(dim(x)[1])) {
    for (j in seq_len(dim(x)[2])) {
        for (k in seq_len(dim(x)[3])) {
            val = i+j+k
            if (k > 2) {
                val = j*k-i
            }
            x[i,j,k] = val
        }
    }
}

Hope this helps :)

Quick update (non-loopy method): For completeness, if you're in a real hurry and want your code to run in 0.07 seconds rather than 0.19 seconds... you could also set things up in a vectory way like this:

comb = expand.grid(seq_len(365), seq_len(5), seq_len(4))
i = comb$Var1; j = comb$Var2; k = comb$Var3
val = i+j+k
subs = which(k>2); val[subs] = (j*k-i)[subs]
x = array(val, dim = c(365, 5, 4))

In the above, the variables i, j and k are vectors with length 7300 (the number of cells in the array). As before, the default choice for val is the sum i+j+k except on the subset k>2, where val is j*k-i instead - exactly the same as the example in the first part of my answer. Obviously the notation in this method is quite a bit harder which is why I thought it'd be better to show you the loop-based solution. Hopefully you'll see how you could add in other conditions to the above though. The final line maps the vector val over to the array x in the right way so that each x[i,j,k] takes on the correct value of val. Try it and see :)

One small point to note though: if you were ever to want to run this sort of algorithm on a massive array (much, much, much bigger than the one you have now), then the approach immediately above would definitely be the one to use to minimise runtime. For your case, my advice is to use whichever one you feel more comfortable with as the runtime isn't really an issue.

Cheers! :)

Upvotes: 3

Related Questions