Populating subarrays in R [<-`*tmp*' : subscript out of bounds

Question

I am building a simulation to randomly assign character labels to subarrays in R based on user-defined parameters.

My code is as follows

K <- 2        ### Number of subarrays 
K1 <- c(1:3)  ### labels in first subarray
K2 <- c(4:5)  ### labels in second subarray

N <- 10 
Hstar <- 5 

perms <- 10  ### rows in each subarray

specs <- 1:N
specs1 <- 1:(N/2) ### specs in subarray 1
specs2 <- ((N/2) + 1):N ### specs in subarray 2

pop <- array(dim = c(c(perms, N/K), K)) ### population subarrays

haps <- as.character(1:Hstar) ### character labels

probs <- rep(1/Hstar, Hstar)  ### label probabilities


### 'for' loop to randomly populate 'pop' with 'haps' according to 'probs'

for(j in 1:perms){
    for(i in 1:K){
        if(i == 1){
            pop[j, specs, i] <- sample(haps, size = N, replace = TRUE, prob = probs)
        }
        else{
            pop[j, specs1,  1] <- sample(haps[K1], size = N/2, replace = TRUE, prob = probs[K1])
            pop[j, specs2,  2] <- sample(haps[K2], size = N/2, replace = TRUE, prob = probs[K2])
        }
    }
}

What I want to do is populate (by rows, not columns) 'pop', which consists of two subarrays, with character labels ('haps'). Specifically, subarray 1 must contain only labels from K1 and subarray 2 must contain only labels from K2. 'pop' has dimension 10 x 5 x 2 (50 values in subarray 1 and the remaining 50 in subarray 2). Unfortunately, R throws the error

Error in `[<-`(`*tmp*`, j, specs, i, value = c("4", "1", "3", "4", "1",  : 
subscript out of bounds

when the nested 'for' loop is run, and I can't seem to understand why. I believe it has to do with specs, specs1, specs2. Basically, the values from 'specs' are divided between 'specs1' and 'specs2'. However, the error suggests that the issue lies in pop[j, specs, i], but since K = 2, this part of the program should not be affected... and yet it is.

Any ideas on how to fix the issue so that the program runs for ANY value of K?

Please let me know if more clarification is needed.

cderv · Accepted Answer

R is a language that is very efficient with vectorised operations. You can use this feature to avoid the for loops altogether.

To make the code work, I needed to correct a few errors:

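specs has to index the first dimension of your array, not the second. The second dimension only has N/K = 5 columns, so pop[j, specs, i] with specs = 1:10 is out of bounds.
I think specs1 and specs2 refer to rows of the second subarray (i = 2 in your example). I modified the code accordingly.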
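The out-of-bounds subscript can be checked directly by comparing the array's dimensions with the largest index in specs. This is a minimal sketch that reuses the values from the question; the name msg is introduced here only for illustration.

```r
perms <- 10; N <- 10; K <- 2
pop <- array(dim = c(perms, N / K, K))   # 10 rows x 5 columns x 2 subarrays
specs <- 1:N

dim(pop)     # 10 5 2
max(specs)   # 10, larger than the 5 columns available

# Indexing the second dimension with specs reproduces the error;
# tryCatch() captures the message instead of stopping the script
msg <- tryCatch({
  pop[1, specs, 1] <- "1"
  "no error"
}, error = function(e) conditionMessage(e))
msg
```

Using specs as the row index instead, as in pop[specs, 1, 1], stays within bounds because the first dimension has 10 rows.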

To fill the arrays, I generate a sample whose size matches the part of the array being filled, using length and dim to compute it. Note that R fills arrays by column: every row of the first column, then every row of the second column, and so on.
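Column-wise filling is easy to see on a small example. The sketch below uses a toy vector rather than the simulation data, and the names by_col and by_row are illustrative only. If row-wise filling matters, matrix(..., byrow = TRUE) is one way to get it for a single slice; for independent random draws the fill order does not change the distribution of the result.

```r
x <- 1:6

# Default fill: down the first column, then the second, and so on
by_col <- array(x, dim = c(2, 3))
by_col
#>      [,1] [,2] [,3]
#> [1,]    1    3    5
#> [2,]    2    4    6

# Row-wise fill of a single slice
by_row <- matrix(x, nrow = 2, ncol = 3, byrow = TRUE)
by_row
#>      [,1] [,2] [,3]
#> [1,]    1    2    3
#> [2,]    4    5    6
```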

K <- 2        ### Number of subarrays 
K1 <- c(1:3)  ### labels in first subarray
K2 <- c(4:5)  ### labels in second subarray

N <- 10 
Hstar <- 5 

perms <- 10  ### rows in each subarray

specs <- 1:N
specs1 <- 1:(N/2) ### specs in subarray 1
specs2 <- ((N/2) + 1):N ### specs in subarray 2

pop <- array(dim = c(c(perms, N/K), K)) ### population subarrays
haps <- as.character(1:Hstar) ### character labels
probs <- rep(1/Hstar, Hstar)  ### label probabilities

pop[specs, , 1] <- sample(haps, size = length(specs) * dim(pop)[2], replace = TRUE, prob = probs)
pop[specs1, , 2] <- sample(haps[K1], size = length(specs1) * dim(pop)[2], replace = TRUE, prob = probs[K1])
pop[specs2, , 2] <- sample(haps[K2], size = length(specs2) * dim(pop)[2], replace = TRUE, prob = probs[K2])

pop
#> , , 1
#> 
#>       [,1] [,2] [,3] [,4] [,5]
#>  [1,] "4"  "3"  "2"  "3"  "2" 
#>  [2,] "5"  "4"  "3"  "1"  "4" 
#>  [3,] "1"  "3"  "4"  "3"  "5" 
#>  [4,] "3"  "3"  "5"  "5"  "3" 
#>  [5,] "2"  "4"  "3"  "4"  "4" 
#>  [6,] "3"  "3"  "2"  "4"  "1" 
#>  [7,] "5"  "1"  "4"  "4"  "1" 
#>  [8,] "4"  "3"  "2"  "3"  "2" 
#>  [9,] "3"  "2"  "3"  "3"  "1" 
#> [10,] "3"  "4"  "1"  "4"  "2" 
#> 
#> , , 2
#> 
#>       [,1] [,2] [,3] [,4] [,5]
#>  [1,] "3"  "3"  "2"  "1"  "3" 
#>  [2,] "2"  "2"  "2"  "2"  "2" 
#>  [3,] "2"  "2"  "2"  "2"  "1" 
#>  [4,] "2"  "3"  "2"  "3"  "1" 
#>  [5,] "1"  "2"  "2"  "3"  "2" 
#>  [6,] "5"  "5"  "5"  "4"  "5" 
#>  [7,] "4"  "5"  "4"  "5"  "5" 
#>  [8,] "5"  "5"  "4"  "5"  "5" 
#>  [9,] "4"  "5"  "5"  "4"  "4" 
#> [10,] "5"  "4"  "5"  "5"  "4"

I think you can build on that and parametrise it to allow any value of K.
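One possible way to generalise, offered as a sketch under stated assumptions rather than a tested solution: keep the label sets in a list and loop over the third dimension. This follows the goal stated in the question (subarray k holds only labels from the k-th set), which is not the same layout as the three assignments above. The name label_sets is hypothetical, and N is assumed to be divisible by K.

```r
K <- 2
label_sets <- list(1:3, 4:5)   # hypothetical container: one label set per subarray
stopifnot(length(label_sets) == K)

N <- 10
Hstar <- 5
perms <- 10

haps <- as.character(1:Hstar)     # character labels
probs <- rep(1 / Hstar, Hstar)    # label probabilities

pop <- array(NA_character_, dim = c(perms, N / K, K))

for (k in seq_len(K)) {
  idx <- label_sets[[k]]
  # sample() treats prob as weights, so the subset need not sum to 1
  pop[, , k] <- sample(haps[idx], size = perms * (N / K),
                       replace = TRUE, prob = probs[idx])
}

pop
```

Changing K then only requires supplying a list of K label sets; the loop runs once per subarray rather than once per cell.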
