Reputation: 108
The following code compiles and executes correctly, but everytime I run it, my R session gets a fatal error shortly after it finishes. I'm running R version 3.3.2 and Rtools 3.3.
Is there anything I missed? How can I trace what's causing the crash?
#include<Rcpp.h>
using namespace Rcpp;
NumericMatrix dupCheckRcpp(NumericMatrix x) {
int nrow, ncol;
int i, j, k, m, n;
bool flag;
NumericMatrix dupMat(300,ncol);
n = 0;
nrow = 0; ncol = 0;
nrow = x.nrow();
ncol = x.ncol();
for (i = 0; i < nrow - 1 ; ++i) {
for (j = i + 1; j < nrow; ++j) {
flag = TRUE;
for (k = 0; k < ncol; ++k) {
if (x(i,k) != x(j,k)) {
flag = FALSE;
break;
}
}
if (flag == TRUE) {
for (m = 0; m < ncol; ++m) {
dupMat(n,m) = x(i,m);
}
n = n + 1;
}
}
}
return dupMat;
}
Upvotes: 0
Views: 410
Reputation: 20746
There are a few issues that are problematic with your code. We begin by looking at how the result matrix is defined, use of bool
, and then detail issues with undefined behavior (UB) as a result of the matrix subset.
The definition of:
NumericMatrix dupMat(300, ncol);
has two issues:
ncol
has been initializedx
matrix has nrow
fixed at 300.Move the instantiation of the dupMat
till after ncol
and nrow
are initialized. Alternatively, move it until after you know the amount of duplicate rows.
nrow = x.nrow();
ncol = x.ncol();
Rcpp::NumericMatrix dupMat(nrow, ncol);
In addition, bool
values within C++ are written in lower case.
That is, use true
in place of TRUE
and false
in place of FALSE
while setting the values of flag
variable.
There are three ways to access individual elements in a NumericMatrix
, however, we'll only focus on two of them that use i,j
indices.
(i,j)
: Accessing elements in this manner forgoes a bounds check and subsequent exception flag that warns if the point is not within range. In essence, this access method was causing a UB since n = n + 1
could easily go beyond the row index. Probably the UB caused havoc at a later point when RStudio or R ran a background task causing the crash to happen. .at(i,j)
: This is the preferred method as it provides a bounds check and throws a nifty exception e.g. Error in dupCheckRcpp(a) : index out of bounds
Which is triggered by the following code snippet:
if (flag == true) {
for (m = 0; m < ncol; ++m) {
Rcpp::Rcout << "dupMat (" << n << ","<< m << ")" << std::endl <<
"x (" << i << ","<< m << ")" << std::endl;
dupMat.at(n, m) = x.at(i, m);
}
n = n + 1; // able to exceed nrow.
}
The main reason for n = n + 1
to be hitting the upper bound is due to the placement being within the second for
loop that gets re-instantiated each time.
Without knowing your intent behind the duplicate check, outside of guessing that it is checking for duplication that may be present within matrix rows. I'm going to stop here.
Upvotes: 4