compbiostats

Reputation: 961

Populating subarrays in R: Error in `[<-`(`*tmp*`, ...) : subscript out of bounds

I am building a simulation to randomly assign character labels to subarrays in R based on user-defined parameters.

My code is as follows

K <- 2        ### Number of subarrays 
K1 <- c(1:3)  ### labels in first subarray
K2 <- c(4:5)  ### labels in second subarray

N <- 10 
Hstar <- 5 

perms <- 10  ### rows in each subarray

specs <- 1:N
specs1 <- 1:(N/2) ### specs in subarray 1
specs2 <- ((N/2) + 1):N ### specs in subarray 2

pop <- array(dim = c(c(perms, N/K), K)) ### population subarrays

haps <- as.character(1:Hstar) ### character labels

probs <- rep(1/Hstar, Hstar)  ### label probabilities


### 'for' loop to randomly populate 'pop' with 'haps' according to 'probs'

for(j in 1:perms){
    for(i in 1:K){
        if(i == 1){
            pop[j, specs, i] <- sample(haps, size = N, replace = TRUE, prob = probs)
        }
        else{
            pop[j, specs1,  1] <- sample(haps[K1], size = N/2, replace = TRUE, prob = probs[K1])
            pop[j, specs2,  2] <- sample(haps[K2], size = N/2, replace = TRUE, prob = probs[K2])
        }
    }
}

What I want to do is populate 'pop' (by rows, not columns), which consists of two subarrays, with character labels ('haps'). Specifically, subarray 1 must contain only labels from K1, and subarray 2 must contain only labels from K2. 'pop' has dimension 10 x 5 x 2 (50 values in subarray 1 and the remaining 50 in subarray 2). Unfortunately, R throws the error

Error in `[<-`(`*tmp*`, j, specs, i, value = c("4", "1", "3", "4", "1",  : 
subscript out of bounds

when the nested 'for' loop is run, and I can't seem to understand why. I believe it has to do with specs, specs1, specs2. Basically, the values from 'specs' are divided between 'specs1' and 'specs2'. However, the error suggests that the issue lies in pop[j, specs, i], but since K = 2, this part of the program should not be affected... and yet it is.

Any ideas on how to fix the issue so that the program runs for ANY value of K?

Please let me know if more clarification is needed.

Upvotes: 0

Views: 117

Answers (2)

cderv

Reputation: 6542

R is a language that is very efficient with vectorisation. You can use this feature to avoid the for loop.

To make the code work, I needed to correct a few errors:

  1. specs refers to the first dimension of your array, not the second.
  2. I think specs1 and specs2 refer to the second subarray (i = 2 in your example). I modified the code accordingly.

To fill the arrays, I generate a sample whose size matches the block I want to fill, using length and dim to compute it. An array is filled by column, i.e. every row of the first column, then every row of the second column, and so on.
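As an aside on the fill order mentioned above, here is a minimal sketch with a tiny matrix (the names vals, m_col and m_row are made up for illustration). It contrasts the default column-wise fill with a row-wise fill via byrow = TRUE. For independent random draws such as the samples in this question, the fill order should not change the distribution of the result, only where each draw ends up.

```r
# Small illustration of R's column-major fill order (values chosen for readability)
vals <- 1:6

# Default behaviour: values go down column 1, then column 2, and so on
m_col <- matrix(NA_integer_, nrow = 2, ncol = 3)
m_col[] <- vals

# Row-wise alternative: values go across row 1, then row 2
m_row <- matrix(vals, nrow = 2, ncol = 3, byrow = TRUE)

m_col[1, ]   # 1 3 5
m_row[1, ]   # 1 2 3
```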


K <- 2        ### Number of subarrays 
K1 <- c(1:3)  ### labels in first subarray
K2 <- c(4:5)  ### labels in second subarray

N <- 10 
Hstar <- 5 

perms <- 10  ### rows in each subarray

specs <- 1:N
specs1 <- 1:(N/2) ### specs in subarray 1
specs2 <- ((N/2) + 1):N ### specs in subarray 2

pop <- array(dim = c(c(perms, N/K), K)) ### population subarrays
haps <- as.character(1:Hstar) ### character labels
probs <- rep(1/Hstar, Hstar)  ### label probabilities

pop[specs, , 1] <- sample(haps, size = length(specs) * dim(pop)[2], replace = TRUE, prob = probs)
pop[specs1, , 2] <- sample(haps[K1], size = length(specs1) * dim(pop)[2], replace = TRUE, prob = probs[K1])
pop[specs2, , 2] <- sample(haps[K2], size = length(specs2) * dim(pop)[2], replace = TRUE, prob = probs[K2])

pop
#> , , 1
#> 
#>       [,1] [,2] [,3] [,4] [,5]
#>  [1,] "4"  "3"  "2"  "3"  "2" 
#>  [2,] "5"  "4"  "3"  "1"  "4" 
#>  [3,] "1"  "3"  "4"  "3"  "5" 
#>  [4,] "3"  "3"  "5"  "5"  "3" 
#>  [5,] "2"  "4"  "3"  "4"  "4" 
#>  [6,] "3"  "3"  "2"  "4"  "1" 
#>  [7,] "5"  "1"  "4"  "4"  "1" 
#>  [8,] "4"  "3"  "2"  "3"  "2" 
#>  [9,] "3"  "2"  "3"  "3"  "1" 
#> [10,] "3"  "4"  "1"  "4"  "2" 
#> 
#> , , 2
#> 
#>       [,1] [,2] [,3] [,4] [,5]
#>  [1,] "3"  "3"  "2"  "1"  "3" 
#>  [2,] "2"  "2"  "2"  "2"  "2" 
#>  [3,] "2"  "2"  "2"  "2"  "1" 
#>  [4,] "2"  "3"  "2"  "3"  "1" 
#>  [5,] "1"  "2"  "2"  "3"  "2" 
#>  [6,] "5"  "5"  "5"  "4"  "5" 
#>  [7,] "4"  "5"  "4"  "5"  "5" 
#>  [8,] "5"  "5"  "4"  "5"  "5" 
#>  [9,] "4"  "5"  "5"  "4"  "4" 
#> [10,] "5"  "4"  "5"  "5"  "4"

I think you can build on that and parametrize it to allow any value of K.
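One possible way to parametrize this for any K is sketched below. It assumes the goal stated in the question (subarray k contains only labels from its own set) and that N is divisible by K; label_sets is a hypothetical name, not something from the original code. Note that this differs from the code above, which fills subarray 1 with all labels and splits subarray 2 by rows.

```r
# Sketch: one set of allowed labels per subarray (label_sets is a hypothetical name)
K     <- 2
N     <- 10
Hstar <- 5
perms <- 10

haps  <- as.character(1:Hstar)    # character labels
probs <- rep(1 / Hstar, Hstar)    # label probabilities
label_sets <- list(1:3, 4:5)      # K1 and K2 from the question

# Assumptions made explicit: one label set per subarray, N divisible by K
stopifnot(length(label_sets) == K, N %% K == 0)

pop <- array(NA_character_, dim = c(perms, N / K, K))

for (k in seq_len(K)) {
  idx <- label_sets[[k]]
  # prob is a vector of weights and need not sum to one
  pop[, , k] <- sample(haps[idx], size = perms * (N / K),
                       replace = TRUE, prob = probs[idx])
}

# Each subarray should contain only its own labels
all(pop[, , 1] %in% haps[label_sets[[1]]])
all(pop[, , 2] %in% haps[label_sets[[2]]])
```

To use a different K, change K and supply a label_sets list of the same length.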

Upvotes: 0

GoGonzo

Reputation: 2867

Let me break the error into parts. The line below specifies the assignment dimensions incorrectly. I noticed an inconsistency: you are looping by row (10 iterations), and each row has 5 elements (5 columns). I suspect you meant to loop by column, in which case it should be perms = 5.

To picture the issue, debug the code element by element and look at pop[j, specs, i]. You are trying to refer to pop[1, 1:10, 1], but your subarray has dimension 10 x 5, so you need to switch to pop[, 1, 1] (you don't need to specify 1:10, since it is the whole column).

pop[j, specs, i] <- sample(haps, size = N, replace = TRUE, prob = probs)


sample(haps, size = N, replace = TRUE, prob = probs)
# [1] "3" "1" "4" "3" "2" "1" "1" "1" "2" "2"
pop[j, specs, i]
# Error in pop[j, specs, i] : subscript out of bounds
pop[specs, j, i]
# [1] "5" "2" "1" "4" "3" "5" "1" "5" "5" "2"

pop[, j, i] <- sample(haps, size = N, replace = TRUE, prob = probs)
#      [,1] [,2] [,3] [,4] [,5]
# [1,] "5"  NA   NA   NA   NA  
# [2,] "1"  NA   NA   NA   NA  
# [3,] "4"  NA   NA   NA   NA  
# [4,] "1"  NA   NA   NA   NA  
# [5,] "1"  NA   NA   NA   NA  
# [6,] "2"  NA   NA   NA   NA  
# [7,] "5"  NA   NA   NA   NA  
# [8,] "5"  NA   NA   NA   NA  
# [9,] "3"  NA   NA   NA   NA  
#[10,] "3"  NA   NA   NA   NA  
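The mismatch described above can also be checked directly by comparing the index vector against dim(pop). The following is only a small sketch that rebuilds the array from the question; res is a name introduced here for illustration.

```r
# Rebuild the array from the question and compare index ranges with its dimensions
N <- 10
K <- 2
perms <- 10
pop   <- array(dim = c(perms, N / K, K))
specs <- 1:N

dim(pop)                      # 10 5 2: rows, columns, subarrays
max(specs) <= dim(pop)[2]     # FALSE: 1:10 does not fit the 5 columns
max(specs) <= dim(pop)[1]     # TRUE:  1:10 fits the 10 rows

# try() captures the error instead of stopping the script
res <- try(pop[1, specs, 1], silent = TRUE)
inherits(res, "try-error")    # TRUE
```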

The same issue appears in the else part, where I see the same error. Below is the corrected version:

pop[specs1 , j,  2] <- sample(haps[K1], size = N/2, replace = TRUE, prob = probs[K1])
pop[specs2 , j,  2] <- sample(haps[K2], size = N/2, replace = TRUE, prob = probs[K2])

Anyway, there is a better way to do this task:

pop[,,1] <- 
  apply(
    pop[,,1], 2, 
    function(x) sample(haps, size = N, replace = TRUE, prob = probs) )

pop[specs1,,2] <- 
  apply(
    pop[specs1,,2], 2, function(x)
     sample(haps[K1], size = N/2, replace = TRUE, prob = probs[K1]) )

pop[specs2,,2] <- 
  apply(
    pop[specs2,,2], 2, function(x)
      sample(haps[K2], size = N/2, replace = TRUE, prob = probs[K2]) )

Upvotes: 1
