Testing value identity in Rcpp

I'm a relatively green Rcpp user, and I'm unsure how to test if two values are identical.

For example, the following function is meant to test whether a value is contained in a list, but it returns incorrect results for simple test cases:

#include <Rcpp.h>
using namespace Rcpp;    

// [[Rcpp::export]]
LogicalVector is_member (SEXP val, List coll) {    

    int coll_len = coll.size();    

    if (coll_len == 0) {
        return LogicalVector::create();
    } else {    

        Function identical("identical");    

        for (int ith = 0; ith < coll_len; ++ith) {    

            SEXP elem = coll[ith];    

            if (identical(val, elem)) {
                return true;
            }
        }    

        return false;
    }
}

is_member(1L, list(1L))
# FALSE
is_member(NaN, list(NaN, NaN))
# FALSE

Why is this, and how can you test for identity with the same corner cases and durability of the base function 'identical'? I couldn't find any Rcpp sugar for this purpose, but if I can't find a direct solution I suspect unordered sets or the unique function might be used to test for identity.

If my C++ is non-idiomatic or dangerous I would also appreciate feedback, and if I've been too vague please leave a comment below and I'll amend my question.

Thanks

Upvotes: 2

Views: 137

Answers (1)

Kevin Ushey

Reputation: 21315

Huh, looks like you stumbled on a little bug -- we aren't converting bool to LogicalVector in the expected way.

#include <Rcpp.h>
using namespace Rcpp;

// [[Rcpp::export]]
LogicalVector return_true() {
  return true;
}

/*** R
return_true()
*/

gives

> return_true()
[1] FALSE

So maybe just return LogicalVector::create(true) for now?

One other big problem in your code: identical returns a SEXP, not a bool! You want to get the result as a bool explicitly:

Shield<SEXP> result(identical(val, elem));
if (LOGICAL(result)[0]) { ... }

Although assigning the result directly to bool should work, it seems like maybe it doesn't. There are still traps like this when calling back to R.

That said, other comments on your code:

  1. Generally, calling back to R is slow, so don't do it unless you have to.
  2. Don't use raw SEXPs unless you know what you're doing. Here you are getting lucky since children of Lists are implicitly protected, but in general this is unsafe.
  3. Just call your loop index i; that is the most common convention (it feels weird seeing ith).

It turns out there is a C API for identical available as well. In Rinternals.h, we have:

/* R_compute_identical:  C version of identical() function
   The third arg to R_compute_identical() consists of bitmapped flags for non-default options:
   currently all default to TRUE, so the flag is set for FALSE values:
   1 = !NUM_EQ
   2 = !SINGLE_NA
   4 = !ATTR_AS_SET
   8 = !IGNORE_BYTECODE
*/
Rboolean R_compute_identical(SEXP, SEXP, int);

so you may want to just use that instead.
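To make the bitmapped third argument a little more concrete, here is a minimal sketch in plain C++ of how the four flags from the quoted comment combine. The helper name identical_flags is hypothetical (it is not part of R or Rcpp), and it only covers the four bits listed in that comment; other versions of R may define further bits.

```cpp
// Hypothetical helper, not part of R or Rcpp: builds the integer that the
// quoted comment describes as the third argument of R_compute_identical().
// Each option defaults to true, and a bit is set only when an option is false.
int identical_flags(bool num_eq = true,
                    bool single_na = true,
                    bool attr_as_set = true,
                    bool ignore_bytecode = true) {
    int flags = 0;
    if (!num_eq)          flags |= 1;  // 1 = !NUM_EQ
    if (!single_na)       flags |= 2;  // 2 = !SINGLE_NA
    if (!attr_as_set)     flags |= 4;  // 4 = !ATTR_AS_SET
    if (!ignore_bytecode) flags |= 8;  // 8 = !IGNORE_BYTECODE
    return flags;
}
```

With every option left at true the helper returns 0, so under the assumptions above a call along the lines of R_compute_identical(val, elem, 0) would ask for the behavior the comment describes as the default. The return type is Rboolean, so it can be used directly in an if condition.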

Upvotes: 4
