Melissa Key
Melissa Key

Reputation: 4551

Strange behavior when incrementally sampling using RcppArmadillo::sample

I'm trying to implement some draws using a polya urn scheme using Rcpp. Basically, I have a matrix I'm drawing from, and a 2nd matrix with weights proportional to the probabilities. After each draw, I need to increase the weight of whichever cell I drew.

I was running into some indexing errors which lead me to examine the sampling more generally, and I found that my weight matrix was getting modified by RcppArmadillo::sample. Two questions (1) is this behavior that I should have expected or is this a bug which I should report somewhere? (2) Any ideas on current work-around? Here's a reproducible example:

#include <RcppArmadilloExtensions/sample.h>
// [[Rcpp::depends(RcppArmadillo)]]

using namespace Rcpp ;

// [[Rcpp::export]]
void sampler(int N, int inc, NumericMatrix& weight_matrix, int reps) {

  IntegerVector wm_tmp = seq_along(weight_matrix);
  Rcout << "Initial weight_matrix:\n" << weight_matrix << "\n";

  int x_ind;
  for(int i = 0; i < reps; ++i) {
    x_ind = RcppArmadillo::sample(wm_tmp, 1, true, weight_matrix)(0) - 1;
    Rcout << "Weight matrix after sample: (rep = " << i << ")\n" << weight_matrix << "\n";
    Rcout << "x_ind: " << x_ind  << "\n"; 
    // get indices
    weight_matrix[x_ind] = weight_matrix[x_ind] +  inc;
    Rcout << "Add increment of " << inc << " to weight_matrix:\n" << weight_matrix << "\n";
  }
}
// 
// // [[Rcpp::export]]
// IntegerVector seq_cpp(IntegerMatrix x) {
//   IntegerVector tmp = seq_along(x);
//   IntegerVector ret = RcppArmadillo::sample(tmp, 2, true);
//   return ret;
// }

/*** R
weight_matrix <- matrix(1, 5, 2)
sampler(5, 1, weight_matrix, 3)

weight_matrix <- matrix(1, 5, 2)
sampler(5, 0, weight_matrix, 3)
*/

Thanks!

Upvotes: 0

Views: 55

Answers (1)

Dirk is no longer here
Dirk is no longer here

Reputation: 368201

That is known and documented behaviour.

You could do

i) Use Rcpp::clone() to create a distinct copy of your SEXP (ie NumericMatrix).

ii) Use an Armadillo matrix instead and pass as const arma::mat & m.

There are architectural reasons having to do with the way R organizes its data structure which mean that we cannot give you fast access (no copies!) and also protect against writes.

Upvotes: 1

Related Questions