danas.zuokas
danas.zuokas

Reputation: 4643

When calling same Rcpp function several times different results are returned

I have written parallel implementation of sums in groups using RcppParallel.

// [[Rcpp::depends(RcppParallel)]]
#include <Rcpp.h>
#include <RcppParallel.h>
using namespace Rcpp;
using namespace RcppParallel;

struct SumsG: public Worker
{
  const RVector<double> v;
  const RVector<int> gi;

  RVector<double> sg;

  SumsG(const NumericVector v, const IntegerVector gi, NumericVector sg): v(v), gi(gi), sg(sg) {}
  SumsG(const SumsG& p, Split): v(p.v), gi(p.gi), sg(p.sg) {}

  void operator()(std::size_t begin, std::size_t end) {
    for (std::size_t i = begin; i < end; i++) {
      sg[gi[i]] += v[i];
    }
  }

  void join(const SumsG& p) {
    for(std::size_t i = 0; i < sg.length(); i++) {
      sg[i] += p.sg[i];
    }
  }
};

// [[Rcpp::export]]
List sumsingroups(NumericVector v, IntegerVector gi, int ni) {
  NumericVector sg(ni);
  SumsG p(v, gi, sg);
  parallelReduce(0, v.length(), p);

  return List::create(_["sg"] = p.sg);
}

It compiles using Rcpp::sourceCpp. Now when I call it from R sumsingroups(1:10, rep(0:1, each = 5), 2) several times I get the right answer (15 40) and then something different (usually some multiplicative of the right answer). Running

res <- sumsingroups(1:10, rep(0:1, each = 5), 2)
for(i in 1:1000) {
    tmp <- sumsingroups(1:10, rep(0:1, each = 5), 2)
    if(res[[1]][1] != tmp[[1]][1]) break
    Sys.sleep(0.1)
}

breaks at random iteration returning

$sg
[1]  60 160

or

$sg
[1] 30 80

I am new to Rcpp and RcppParallel and do not know what could cause such behavior.


Update. Things that did not help:

  1. Added for (std::size_t i = 0; i < sg.length(); i++) sg[i] = 0; to both of constructors.
  2. Changed names so that they are different in Worker definition and in function implementation.

Upvotes: 4

Views: 385

Answers (1)

mpiktas
mpiktas

Reputation: 11518

Try this.

#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::depends(RcppParallel)]]
#include <RcppParallel.h>

using namespace RcppParallel;
struct SumsInGroups5: public Worker
{
    const RVector<double> v;
    const RVector<int> g;

    std::vector<double> s;

    SumsInGroups5(const NumericVector v, const IntegerVector g): v(v), g(g),  s(*std::max_element(g.begin(), g.end()) + 1, 0.0){ }

    SumsInGroups5(const SumsInGroups5& p, Split): v(p.v), g(p.g), s(*std::max_element(g.begin(), g.end()) + 1, 0.0) {}

    void operator()(std::size_t begin, std::size_t end) {
        for (std::size_t i = begin; i < end; ++i) {
            s[g[i]]+=v[i];
        }

    }

    void join(const SumsInGroups5& rhs) {
        for(std::size_t i = 0; i < s.size(); i++) {
            s[i] += rhs.s[i];
        }
    }
};

// [[Rcpp::export]]
NumericVector sg5(NumericVector v, IntegerVector g) {
    SumsInGroups5 p(v, g);
    parallelReduce(0, v.length(), p);
    return wrap(p.s);
}

/*** R
a <- 1:10
g <- c(rep(0,5),rep(1,5))


bb <- lapply(1:10000,function(x)sg5(a,g))
cc<-do.call("rbind",bb)
unique(cc)
*/

Compared to my other tries this code did not produce weird result in the same cases other code did. Not very assuring.

Upvotes: 1

Related Questions