Heisenberg

Reputation: 8806

Does RcppArmadillo's sample require argument to be instantiated beforehand?

I'm using RcppArmadillo::sample in my Rcpp code and I'm seeing the strange behavior shown below. fun_good works as expected, sampling 1 element from the vector x. However, fun_bad fails to compile, even though the only difference is that I don't create the source vector x beforehand.

#include <RcppArmadilloExtensions/sample.h>
// [[Rcpp::depends(RcppArmadillo)]]

using namespace Rcpp;

// [[Rcpp::export]]
IntegerVector fun_good() {
  IntegerVector x = seq_len(5);
  IntegerVector newOffer = RcppArmadillo::sample(x, 1, true);
  return newOffer;
}

// [[Rcpp::export]]
IntegerVector fun_bad() {
  IntegerVector newOffer = RcppArmadillo::sample(seq_len(5), 1, true);
  return newOffer;
}

The error message is lvalue required as left operand of assignment, and points to the following source. Why wouldn't ret[ii] be assignable in fun_bad?

// copy the results into the return vector
for (ii=0; ii<size; ii++) {
    jj = index[ii];
    ret[ii] = x[jj];
}
return(ret);

Upvotes: 3

Views: 135

Answers (2)

nrussell

Reputation: 18602

TL;DR

Use an explicit cast (as cdeterman has done) or an explicit constructor call:

// [[Rcpp::export]]
Rcpp::IntegerVector fun_bad() {
    Rcpp::IntegerVector newOffer = 
        RcppArmadillo::sample(Rcpp::IntegerVector(Rcpp::seq_len(5)), 1, true);
    return newOffer;
}

Don't quote me on the specifics, but I'm pretty sure you've encountered an edge case of Expression Templates not playing nicely with template type deduction rules. First, the relevant part of the error message emitted by my compiler:

... In instantiation of ‘T Rcpp::RcppArmadillo::sample(const T&, int, bool, Rcpp::NumericVector) [with T = Rcpp::sugar::SeqLen; ...

Hence, in the templated sample function, T is deduced as having type Rcpp::sugar::SeqLen.

SeqLen is an expression template class -- defined here -- which, under most circumstances, would be (implicitly) converted to an Rcpp::IntegerVector (due to its inheritance from Rcpp::VectorBase<INTSXP, ...>). For example,

// [[Rcpp::export]]
Rcpp::IntegerVector test(int n = 5) {
    return Rcpp::seq_len(n); // Ok: implicitly converted to IntegerVector
} 

However, since implicit conversions are part of the overload resolution process, and not the template type deduction process, T is deduced exactly as Rcpp::sugar::SeqLen -- meaning this expression

ret[ii] = x[jj];

is calling Rcpp::sugar::SeqLen::operator[] (and not Rcpp::Vector::operator[] as is typically the case), which produces an rvalue (see below†).
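The deduction issue described above can be reproduced without Rcpp. The following is only a minimal sketch in standard C++: LazySeq, Vec and first_k are hypothetical stand-ins (for SeqLen, IntegerVector and the templated sample respectively), not Rcpp's actual classes.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical stand-in for a lazy sugar expression such as SeqLen:
// elements are computed on demand and returned by value.
struct LazySeq {
    std::size_t n;
    explicit LazySeq(std::size_t n_) : n(n_) {}
    std::size_t size() const { return n; }
    int operator[](std::size_t i) const { return static_cast<int>(i) + 1; }
};

// Hypothetical stand-in for a materialized vector such as IntegerVector.
struct Vec {
    std::vector<int> data;
    explicit Vec(std::size_t n) : data(n) {}
    // Implicit converting constructor: evaluates the lazy expression.
    Vec(const LazySeq& s) : data(s.size()) {
        for (std::size_t i = 0; i < s.size(); ++i) data[i] = s[i];
    }
    std::size_t size() const { return data.size(); }
    int& operator[](std::size_t i) { return data[i]; }
    const int& operator[](std::size_t i) const { return data[i]; }
};

// Same shape as the templated sample(): T is deduced from the argument,
// and the result container has that same type T.
template <typename T>
T first_k(const T& x, std::size_t k) {
    assert(k <= x.size());
    T ret(k);
    for (std::size_t i = 0; i < k; ++i) {
        ret[i] = x[i];  // needs an assignable lvalue on the left-hand side
    }
    return ret;
}

// LazySeq::operator[] yields a plain int (not assignable);
// Vec::operator[] yields int& (assignable).
static_assert(!std::is_assignable<decltype(std::declval<LazySeq&>()[0]), int>::value,
              "element of the lazy expression is not assignable");
static_assert(std::is_assignable<decltype(std::declval<Vec&>()[0]), int>::value,
              "element of the materialized vector is assignable");

// Explicit conversion at the call site: T is deduced as Vec.
inline Vec take_two_converted() { return first_k(Vec(LazySeq(5)), 2); }

// Explicit template argument: T is fixed to Vec, so the implicit
// LazySeq -> Vec conversion is applied to the argument.
inline Vec take_two_explicit() { return first_k<Vec>(LazySeq(5), 2); }

// first_k(LazySeq(5), 2) would deduce T = LazySeq and fail to compile
// at `ret[i] = x[i]`, which is why it is not called here.
```

In this sketch both working variants give T = Vec. Whether passing an explicit template argument also works with the real RcppArmadillo::sample is an assumption that has not been tested here.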

You might have noticed that unlike some of the ET sugar classes, SeqLen is more of a "true" expression template in that it just provides an operator[] for being lazily evaluated. It does not store a constant reference data member / provide a vector conversion operator (as, e.g. cumprod and many others do); it is literally used to construct a vector -- this constructor if I'm not mistaken,

template <bool NA, typename VEC>
Vector( const VectorBase<RTYPE,NA,VEC>& other ) {
    RCPP_DEBUG_2( "Vector<%d>( const VectorBase<RTYPE,NA,VEC>& ) [VEC = %s]", RTYPE, DEMANGLE(VEC) )
    import_sugar_expression( other, typename traits::same_type<Vector,VEC>::type() ) ;
} 

which uses the following helper methods defined in the Vector class:

// we are importing a real sugar expression, i.e. not a vector
template <bool NA, typename VEC>
inline void import_sugar_expression( const Rcpp::VectorBase<RTYPE,NA,VEC>& other, traits::false_type ) {
    RCPP_DEBUG_4( "Vector<%d>::import_sugar_expression( VectorBase<%d,%d,%s>, false_type )", RTYPE, NA, RTYPE, DEMANGLE(VEC) ) ;
    R_xlen_t n = other.size() ;
    Storage::set__( Rf_allocVector( RTYPE, n ) ) ;
    import_expression<VEC>( other.get_ref() , n ) ;
} 

template <typename T>
inline void import_expression( const T& other, int n ) {
    iterator start = begin() ;
    RCPP_LOOP_UNROLL(start,other)
} 

At any rate, until Rcpp automagically generates an actual vector object from a sugar::SeqLen expression, it is not available for use (at least in the way that is required in this particular expression: ret[ii] = x[jj];).
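To make the "materialize first, then index" step concrete, here is a rough sketch of the same pattern in standard C++. ExprBase, Seq and IntVec are hypothetical names chosen to mirror VectorBase, SeqLen and Vector; the real Rcpp code does more (R allocation, loop unrolling, tag dispatch), so treat this as an illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CRTP base, playing the role of Rcpp::VectorBase.
template <typename Derived>
struct ExprBase {
    const Derived& get_ref() const { return static_cast<const Derived&>(*this); }
    std::size_t size() const { return get_ref().size(); }
};

// Lazy sequence 1..n: nothing is stored, operator[] computes a value.
struct Seq : ExprBase<Seq> {
    std::size_t n;
    explicit Seq(std::size_t n_) : n(n_) {}
    std::size_t size() const { return n; }
    int operator[](std::size_t i) const { return static_cast<int>(i) + 1; }
};

// Materialized vector with real storage.
struct IntVec : ExprBase<IntVec> {
    std::vector<int> data;
    explicit IntVec(std::size_t n) : data(n) {}

    // Converting constructor: allocate, then copy element by element
    // from the lazy expression (the role of import_expression above).
    template <typename E>
    IntVec(const ExprBase<E>& other) : data(other.size()) {
        const E& expr = other.get_ref();
        for (std::size_t i = 0; i < data.size(); ++i) data[i] = expr[i];
    }

    std::size_t size() const { return data.size(); }
    int& operator[](std::size_t i) { assert(i < data.size()); return data[i]; }
    const int& operator[](std::size_t i) const { assert(i < data.size()); return data[i]; }
};

// Returning a Seq where an IntVec is expected triggers the conversion,
// just as `return Rcpp::seq_len(n);` does in a function returning IntegerVector.
inline IntVec materialize(std::size_t n) { return Seq(n); }
```

Only after the conversion does operator[] refer to stored elements that can be written to; before it, each index is just a freshly computed value.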


†Just as a sanity check, we can use a few C++11 metaprogramming constructs to examine the difference between the return values of SeqLen::operator[] and Vector::operator[]:

// [[Rcpp::plugins(cpp11)]]
// [[Rcpp::depends(RcppArmadillo)]]
#include <RcppArmadilloExtensions/sample.h>
#include <type_traits>

typedef decltype(Rcpp::sugar::SeqLen(1)[0]) rvalue_t;
typedef decltype(Rcpp::IntegerVector::create(1)[0]) lvalue_ref_t;

// [[Rcpp::export]]
void test() {
    // rvalue_t is a non-reference type, so rvalue_t&& is an rvalue reference
    Rcpp::Rcout
        << std::is_rvalue_reference<rvalue_t&&>::value
        << "\n";
    // lvalue_ref_t is an lvalue reference type
    Rcpp::Rcout
        << std::is_lvalue_reference<lvalue_ref_t>::value
        << "\n";

    // rvalue_t is _not_ assignable
    Rcpp::Rcout
        << std::is_assignable<rvalue_t, R_xlen_t>::value
        << "\n";
    // lvalue_ref_t is assignable
    Rcpp::Rcout
        << std::is_assignable<lvalue_ref_t, R_xlen_t>::value
        << "\n";
}

/*** R

test()
# 1     ## true
# 1     ## true
# 0     ## false
# 1     ## true

*/ 

Upvotes: 4

cdeterman

Reputation: 19960

I can't give an explicit reason why this happens, but if you explicitly state that the seq_len output is an IntegerVector, the function compiles and works as expected.

// [[Rcpp::export]]
IntegerVector fun_bad() {
  IntegerVector newOffer = RcppArmadillo::sample((IntegerVector)seq_len(5), 1, true);
  return newOffer;
}

I think it is because the result of the seq_len call passed to sample is a temporary object and therefore an rvalue. Explicitly casting the result to an IntegerVector appears to make it an lvalue, which would explain why it works. Again, I do not know why this is the case.
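The rvalue/lvalue guess can be probed in plain C++. In the sketch below (Lazy, Vec, make_lazy and deduced are hypothetical names, not Rcpp code), the result of the cast is still a temporary; what changes is the type deduced for T, which suggests the cast helps through deduction rather than through value category.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

struct Lazy {};                    // hypothetical stand-in for the seq_len(5) expression
struct Vec {                       // hypothetical stand-in for IntegerVector
    Vec() {}
    Vec(const Lazy&) {}            // implicit conversion from the lazy expression
};

inline Lazy make_lazy() { return Lazy(); }

// Reports which type was deduced for a `const T&` parameter.
template <typename T>
std::string deduced(const T&) {
    return std::is_same<T, Vec>::value ? "Vec" : "Lazy";
}

// decltype of a cast to a non-reference class type is a non-reference
// type, i.e. the cast expression is a prvalue (a temporary), not an lvalue.
static_assert(!std::is_reference<decltype((Vec)make_lazy())>::value,
              "the cast result is still a temporary");

inline void check_deduction() {
    assert(deduced(make_lazy()) == "Lazy");        // no cast: T is the lazy type
    assert(deduced((Vec)make_lazy()) == "Vec");    // C-style cast: T is Vec
    assert(deduced(Vec(make_lazy())) == "Vec");    // constructor call: T is Vec
}
```

Both temporaries bind to `const T&` without trouble, so in this sketch the only observable difference between the casted and uncasted call is the deduced T.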

I'm sure Dirk or someone with more C++ knowledge than me will come by to give a more explicit answer eventually if you wait long enough.

Upvotes: 2
