Ari.stat

Reputation: 472

Vectorized Rcpp rbinom with probabilities in Armadillo matrix

I have a symmetric matrix of probabilities whose diagonal entries are zero, for example:

   0    0.5   0.1   0.6
   0.5   0    0.2   0.1
   0.1  0.2    0    0.2
   0.6  0.1   0.2    0

I want to draw a dummy (0/1) matrix such that the probability that entry [i,j] equals 1 is entry [i,j] of the probability matrix. The probability matrix is a large Armadillo matrix (5000x5000). Of course, the diagonal entries of the sample should be zero because their probabilities are zero. I wrote two functions to do this, but they are not fast, and I need to sample this matrix many times in a loop.

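mat binom1(mat& prob){
  int n=prob.n_rows;
  mat sample(n,n,fill::zeros);   // diagonal stays zero
  NumericVector temp(2);

  // visit only the upper triangle and draw (i,j) and (j,i) together
  for(int i(0);i<n-1;++i){
   for(int j(i+1);j<n;++j){
    temp=rbinom(2,1,prob(i,j));  // two independent Bernoulli(prob(i,j)) draws
    sample(i,j)=temp(0); sample(j,i)=temp(1);
   }
  }
 return sample;
}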


mat binom2(mat& prob){
  int n=prob.n_rows;
  mat sample(n,n);

  // one Bernoulli draw per entry, diagonal included (its probability is 0)
  for(int i(0);i<n;++i){
    for(int j(0);j<n;++j){
      sample(i,j)=as<double>(rbinom(1,1,prob(i,j)));
    }
  }
  return sample;
}

Both are slower than the vectorized rbinom in R.

   z=matrix(runif(1000^2),1000) #just an example for 1000x1000 matrix
   microbenchmark(rbinom(nrow(z)^2,1,z),binom1(z),binom2(z))

Results

                   expr       min        lq      mean   median       uq      max
rbinom(nrow(z)^2, 1, z)  95.43756  95.94606  98.29283  97.5273 100.3040 108.2293
              binom1(z) 131.33937 133.25487 139.75683 136.4530 139.5511 229.0484
              binom2(z) 168.38226 172.60000 177.95935 175.6447 180.9531 277.3501

Is there a way to make the code faster?

I saw one example here, but in my case the probabilities are in an Armadillo matrix.

Upvotes: 2

Views: 372

Answers (2)

Ari.stat

Reputation: 472

Thank you so much. I also tried this:

   umat binom4(mat& prob){
     int n=prob.n_rows;
     mat temp(n,n,fill::randu);
     return (temp<prob);
   }

I think it is a bit faster:

microbenchmark(rbinom(nrow(z)^2,1,z),binom1(z),binom2(z),binom3(z),binom4(z))

                   expr       min        lq      mean    median        uq       max neval
rbinom(nrow(z)^2, 1, z)  94.24809  95.29728  97.24977  95.86829  98.19758 108.30877   100
              binom1(z) 130.20266 132.48951 138.07100 134.03693 137.34613 297.86393   100
              binom2(z) 164.96716 168.17024 175.89784 170.29310 173.93890 338.99306   100
              binom3(z)  64.57977  64.78340  67.03158  65.81533  67.42386  92.31300   100
              binom4(z)  29.66925  31.44107  32.81296  31.77392  33.31575  55.65539   100
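
For anyone wondering why the comparison in binom4 gives the right distribution: if U is uniform on [0, 1), then P(U < p) = p, so thresholding a uniform draw against p is a Bernoulli(p) draw. Below is a minimal standalone sketch of that idea. It is not the Rcpp code above: to keep it self-contained I assume a flat std::vector in place of the Armadillo matrix and std::mt19937 in place of R's RNG, and the name bernoulli_by_threshold is made up for the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Draw one 0/1 value per entry by thresholding a uniform draw:
// for U ~ Uniform[0, 1), P(U < p) = p.
// `prob` is a flat buffer of probabilities (an assumed stand-in for arma::mat).
std::vector<int> bernoulli_by_threshold(const std::vector<double>& prob,
                                        std::mt19937& rng) {
    std::uniform_real_distribution<double> unif(0.0, 1.0);
    std::vector<int> sample(prob.size());
    for (std::size_t k = 0; k < prob.size(); ++k) {
        assert(prob[k] >= 0.0 && prob[k] <= 1.0);  // must be a valid probability
        // An entry with p == 0 (such as the diagonal) can never become 1,
        // because U >= 0 is never strictly below 0.
        sample[k] = unif(rng) < prob[k] ? 1 : 0;
    }
    return sample;
}
```

Since zero probabilities always produce zeros, the diagonal needs no special handling with this approach.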

Upvotes: 1

F. Privé

Reputation: 11728

Adapting the answer to the nearly duplicate question, you can use:

mat binom3(const mat& prob) {

  int n = prob.n_rows;
  mat sample(n, n);

  std::transform(prob.begin(), prob.end(), sample.begin(), 
                 [=](double p){ return R::rbinom(1, p); });

  return sample;
}

Microbenchmark:

Unit: milliseconds
                    expr      min       lq      mean   median        uq       max neval
 rbinom(length(z), 1, z) 46.88264 47.28971  48.09543 47.66346  48.40734  65.29790   100
               binom1(z) 76.98416 82.60813  84.93669 83.51432  84.04780 126.46992   100
               binom2(z) 96.20707 98.59145 101.99215 99.56175 102.02750 153.04754   100
               binom3(z) 34.01417 34.49066  35.12199 34.93946  35.47979  38.22539   100

Upvotes: 1
