HBat

Reputation: 5702

Checking Null and NA of a vector in Rcpp

I'm trying to compute the sum of a vector (y), conditional on whether the values of a second, nullable vector (r) are NA or not. If r is NULL, all of the values of y should be summed. If all elements of r are NA, the function should return NA. Please see the end of the text for the desired output.

I tried the following code first:

library(Rcpp)
cppFunction('double foo(NumericVector y, Rcpp::Nullable<Rcpp::IntegerVector> r = R_NilValue) {
  double output = 0;
  bool return_na = !Rf_isNull(r);
  int y_count = y.size();
  for (int i = 0; i < y_count; i++) {
    if (Rf_isNull(r)  || !R_IsNA(r[i])) {
    //// if (Rf_isNull(r)  || !R_IsNA(as<IntegerVector>(r)[i])) {
      if (!Rf_isNull(r))
        Rcout << R_IsNA(as<IntegerVector>(r)[i]) << " - "<< as<IntegerVector>(r)[i] << std::endl;
      output = output + y[i];
      return_na = false;
    } 
  }
  if (return_na) 
    return NA_REAL;
  return output;
}')

This gave me the following error:

 error: invalid use of incomplete type 'struct SEXPREC'
     if (Rf_isNull(r)  || !R_IsNA(r[i])) {
                                     ^

To work around it, I used if (Rf_isNull(r) || !R_IsNA(as<IntegerVector>(r)[i])) { instead. But this time, once r is converted to an integer vector, its NA values are treated as ordinary numbers and R_IsNA() no longer detects them, so every element passes the !R_IsNA() test.
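A minimal sketch of why this happens, in plain C++ without the R headers. It assumes that R represents an integer NA as the smallest int and that R_IsNA() takes a double and looks for a special NaN. The names kNaInteger, looks_like_na_real, is_na_integer and na_test_after_conversion are hypothetical stand-ins, not part of R or Rcpp, and std::isnan only approximates what R_IsNA() checks.

```cpp
#include <cmath>
#include <limits>

// Assumption: stand-in for R's NA_INTEGER, which R defines as INT_MIN.
constexpr int kNaInteger = std::numeric_limits<int>::min();

// Hypothetical approximation of R_IsNA(double): R's real NA is a NaN with a
// particular payload, so a NaN-based test can only say "true" for a NaN.
inline bool looks_like_na_real(double x) { return std::isnan(x); }

// A test that works for an integer element: compare with the sentinel.
inline bool is_na_integer(int x) { return x == kNaInteger; }

// What the second attempt does implicitly: the int element is converted
// to double before the NaN-based test runs.
inline bool na_test_after_conversion(int x) {
    return looks_like_na_real(static_cast<double>(x));
}
```

Under these assumptions the integer NA becomes the ordinary double -2147483648 after conversion, so a NaN-based test reports it as not missing, while the sentinel comparison still recognises it.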

Here is the expected output that I want.

foo(1:4, NULL) #  <- This should return 10 = 1 + 2 + 3 + 4
foo(1:4, c(1, 1, 1, 1)) #  <- This should return 10 = 1 + 2 + 3 + 4
foo(1:4, c(1, 1, NA, 1)) #  <- This should return 7 = 1 + 2 + 4
foo(1:4, c(NA, NA, NA, NA)) # <- This should return NA

How can I write a function that behaves this way? (This example is simplified; I'm not particularly interested in the sum function. Instead, I'm interested in checking for NA and NULL simultaneously, as in the example.)

Upvotes: 2

Views: 788

Answers (1)

Ralf Stubner

Reputation: 26833

Three suggestions:

  • Use Rcpp instead of R's C API.
  • Return early when r is NULL.
  • Create a LogicalVector before looping through the input vector.

Putting these together:

#include <Rcpp.h>

// [[Rcpp::export]]
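double foo(Rcpp::NumericVector y, Rcpp::Nullable<Rcpp::IntegerVector> r = R_NilValue) {
    // r is NULL: nothing to mask, so sum all of y
    if (r.isNull())
        return Rcpp::sum(y);

    // TRUE where r is NA; r.as() yields the underlying IntegerVector
    Rcpp::LogicalVector mask = Rcpp::is_na(r.as());
    // every element of r is NA
    if (Rcpp::is_true(Rcpp::all(mask)))
        return NA_REAL;

    // add up y[i] only where r[i] is not NA
    double output = 0.0;
    int y_count = y.size();
    for (int i = 0; i < y_count; ++i) {
        if (!mask[i]) {
            output += y[i];
        }
    }
    return output;
}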

/***R
foo(1:4, NULL) #  <- This should return 10 = 1 + 2 + 3 + 4
foo(1:4, c(1, 1, 1, 1)) #  <- This should return 10 = 1 + 2 + 3 + 4
foo(1:4, c(1, 1, NA, 1)) #  <- This should return 7 = 1 + 2 + 4
foo(1:4, c(NA, NA, NA, NA)) # <- This should return NA
*/ 

Result:

> Rcpp::sourceCpp('60569482.cpp')

> foo(1:4, NULL) #  <- This should return 10 = 1 + 2 + 3 + 4
[1] 10

> foo(1:4, c(1, 1, 1, 1)) #  <- This should return 10 = 1 + 2 + 3 + 4
[1] 10

> foo(1:4, c(1, 1, NA, 1)) #  <- This should return 7 = 1 + 2 + 4
[1] 7

> foo(1:4, c(NA, NA, NA, NA)) # <- This should return NA
[1] NA

Further suggestion:

  • Use the mask for sub-setting y.
#include <Rcpp.h>

// [[Rcpp::export]]
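double foo(Rcpp::NumericVector y, Rcpp::Nullable<Rcpp::IntegerVector> r = R_NilValue) {
    // r is NULL: nothing to mask, so sum all of y
    if (r.isNull())
        return Rcpp::sum(y);

    // TRUE where r is NA
    Rcpp::LogicalVector mask = Rcpp::is_na(r.as());
    // every element of r is NA
    if (Rcpp::is_true(Rcpp::all(mask)))
        return NA_REAL;

    // keep only the elements of y whose r is not NA, then sum them
    Rcpp::NumericVector tmp = y[!mask];
    return Rcpp::sum(tmp);
}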
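
To check the control flow without an R installation, here is a rough plain C++ model of the same three cases. It is only a sketch under stated assumptions: a null pointer stands in for R's NULL, kNaInt stands in for an integer NA, a quiet NaN stands in for NA_REAL, and masked_sum is a hypothetical name. It is not a replacement for the Rcpp versions above.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Assumption: stand-in for R's integer NA (INT_MIN in R).
constexpr int kNaInt = std::numeric_limits<int>::min();

// Hypothetical plain-C++ model of foo(): r == nullptr plays the role of
// NULL, and the returned quiet NaN plays the role of NA_REAL.
double masked_sum(const std::vector<double>& y,
                  const std::vector<int>* r = nullptr) {
    double total = 0.0;
    if (r == nullptr) {                  // r is NULL: sum everything
        for (double v : y) total += v;
        return total;
    }
    bool any_kept = false;
    // Assumption: only positions present in both vectors are considered.
    const std::size_t n = std::min(y.size(), r->size());
    for (std::size_t i = 0; i < n; ++i) {
        if ((*r)[i] != kNaInt) {         // keep y[i] only where r[i] is not NA
            total += y[i];
            any_kept = true;
        }
    }
    // No element of r was usable: signal "NA" with a NaN.
    return any_kept ? total : std::numeric_limits<double>::quiet_NaN();
}
```

Called with y = {1, 2, 3, 4}, this model mirrors the four cases from the question: no r, an r without missing values, an r with one missing value, and an r that is missing everywhere.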

Upvotes: 6
