Reputation: 41
I wrote the following code to erase zeros from the vector. I use the erase(i)
function from the Rcpp library.
#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export]]
NumericVector erase_zero(NumericVector x) {
for (int i = 0; i < x.size(); i++) {
if (x[i] == 0) {
x.erase(i);
}
}
return x;
}
The code compiles and runs, but the output is wrong for some inputs:
> erase_zero(c(0,1,2,3,0))
[1] 1 2 3
> erase_zero(c(0,0,1,2,3,0,0))
[1] 0 1 2 3 0
> erase_zero(c(0,0,0,1,2,3,0,0,0))
[1] 0 1 2 3 0
> erase_zero(c(0,0,0,0,1,2,3,0,0,0,0))
[1] 0 0 1 2 3 0 0
I don't know why this is happening.
After reading all the answers below, I ran a quick speed test:
> microbenchmark(erase_zero(s), erase_zero1(s), erase_zero_sugar(s))
Unit: microseconds
expr min lq mean median uq max neval
erase_zero(s) 19.311 21.2790 22.54262 22.181 22.8780 35.342 100
erase_zero1(s) 18.573 21.0945 21.95222 21.771 22.4680 36.490 100
erase_zero_sugar(s) 1.968 2.0910 2.57070 2.296 2.5215 24.887 100
erase_zero1 is Roland's first code. Also, ThomasIsCoding's base R approach is more efficient than all of them.
Upvotes: 4
Views: 126
Reputation: 102599
Here is a benchmark of several Rcpp approaches against base R subsetting. It shows that the base R approach x[x != 0] is already the most efficient.
library(Rcpp)
sourceCpp(
code = "
#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export]]
NumericVector f0(NumericVector x) {
int i = 0;
while (i < x.size()) {
if (x[i]==0) {
x.erase(i);
} else {
i++;
}
}
return x;
}
"
)
sourceCpp(
code = "
#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export]]
NumericVector f1(NumericVector x) {
R_xlen_t n = x.size();
for (R_xlen_t i = 0; i < n; i++) {
if (x[i] == 0) {
x.erase(i);
i--;
n--;
}
}
return x;
}
"
)
sourceCpp(
code = "
#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export]]
NumericVector f2(NumericVector x) {
R_xlen_t n = x.size();
NumericVector res;
for (R_xlen_t i = 0; i < n; i++) {
if (x[i] != 0) {
res.push_back(x[i]);
}
}
return res;
}
"
)
sourceCpp(
code = "
#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export]]
NumericVector f3(NumericVector x) {
return x[x != 0];
}
"
)
The code for the comparison, followed by its output:
library(microbenchmark)
set.seed(0)
x <- sample(0:5, 1e5, replace = TRUE)
microbenchmark(
fwhile = f0(x),
ffor1 = f1(x),
ffor2 = f2(x),
fsuger = f3(x),
baseR = x[x != 0],
unit = "relative",
times = 10L
)
Unit: relative
   expr         min         lq        mean      median          uq         max neval
 fwhile 4574.766987 3877.57877 2491.634303 3541.983516 2149.808409 1152.438181    10
  ffor1 4204.952786 3690.07333 2340.518164 3275.927345 2060.156985 1117.993311    10
  ffor2 8270.203280 7302.53550 4754.341310 6746.984478 4158.206201 2221.732950    10
 fsuger    1.236079    1.13896    1.299927    1.110674    1.091769    1.579036    10
  baseR    1.000000    1.00000    1.000000    1.000000    1.000000    1.000000    10
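A plausible reason the erase and push_back variants are thousands of times slower is that each call has to move or copy the remaining elements, so the loop does quadratic work overall. If you want an explicit loop anyway, a common alternative is to count the survivors first and allocate the result once. The sketch below is plain C++ and uses std::vector<double> as a stand-in for NumericVector (an assumption for illustration); drop_zeros_prealloc is a hypothetical name, not part of Rcpp.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: std::vector<double> stands in for Rcpp::NumericVector.
std::vector<double> drop_zeros_prealloc(const std::vector<double>& x) {
    // First pass: count the non-zero values so the output is sized once.
    std::size_t kept = 0;
    for (double v : x) {
        if (v != 0.0) ++kept;
    }

    std::vector<double> out;
    out.reserve(kept);  // single allocation, no regrowth inside the loop

    // Second pass: copy the non-zero values in their original order.
    for (double v : x) {
        if (v != 0.0) out.push_back(v);
    }
    return out;
}
```

Called on {0, 0, 5, 0, 1, 2, 3, 0, 6, 0, 0} it keeps 5, 1, 2, 3, 6. The same count-then-fill idea should carry over to a NumericVector created with the final length, though the sugar and base R versions above remain the simpler choice.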
Upvotes: 2
Reputation: 102599
The size of your x changes dynamically every time erase is called, while your loop index keeps advancing. You can try a while loop like below:
#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export]]
NumericVector erase_zero(NumericVector x) {
int i = 0;
while (i < x.size()) {
if (x[i]==0) {
x.erase(i);
} else {
i++;
}
}
return x;
}
library(Rcpp)
sourceCpp(
code = "
#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export]]
NumericVector erase_zero(NumericVector x) {
int i = 0;
while (i < x.size()) {
if (x[i]==0) {
x.erase(i);
} else {
i++;
}
}
return x;
}
"
)
x <- c(0, 0, 5, 0, 1, 2, 3, 0, 6, 0, 0)
erase_zero(x)
and you will see
[1] 5 1 2 3 6
Upvotes: 3
Reputation: 132969
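erase changes the size of the vector. After an element is erased, the following elements shift down by one position, so the next element lands at index i and the loop's i++ skips it. Stepping i and n back after each erase gives the expected output.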
#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export]]
NumericVector erase_zero(NumericVector x) {
R_xlen_t n = x.size();
for (R_xlen_t i = 0; i < n; i++) {
if (x[i] == 0) {
x.erase(i);
i--;
n--;
}
}
return x;
}
/*** R
erase_zero(c(0,1,2,3,0))
erase_zero(c(0,0,1,2,3,0,0))
erase_zero(c(0,0,0,1,2,3,0,0,0))
erase_zero(c(0,0,0,0,1,2,3,0,0,0,0))
*/
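The skipping can be reproduced outside R. The sketch below is plain C++ with std::vector<double> standing in for NumericVector (an assumption for illustration; the names erase_zero_buggy and erase_zero_fixed are hypothetical). The first function mirrors the question's loop, the second only advances the index when nothing was erased.

```cpp
#include <cstddef>
#include <vector>

// Mirrors the loop from the question: i advances even after an erase.
std::vector<double> erase_zero_buggy(std::vector<double> x) {
    for (std::size_t i = 0; i < x.size(); i++) {
        if (x[i] == 0.0) {
            x.erase(x.begin() + i);  // elements after i shift left by one
        }
        // i++ now steps over whatever just moved into slot i
    }
    return x;
}

// Advance i only when the current element was kept.
std::vector<double> erase_zero_fixed(std::vector<double> x) {
    std::size_t i = 0;
    while (i < x.size()) {
        if (x[i] == 0.0) {
            x.erase(x.begin() + i);  // re-check the same index next time
        } else {
            ++i;
        }
    }
    return x;
}
```

For the input {0, 0, 1, 2, 3, 0, 0} the buggy version returns {0, 1, 2, 3, 0}, the same pattern as in the question, because the second zero of each adjacent pair is stepped over. The fixed version returns {1, 2, 3}.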
However, you should just use some Rcpp sugar. It is more efficient:
#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export]]
NumericVector erase_zero_sugar(NumericVector x) {
return x[x != 0];
}
You should also read "Why are these numbers not equal?", because x[i] == 0 is an exact floating-point comparison.
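As a small illustration of that caveat, a value that is zero on paper can come out of floating-point arithmetic as a tiny non-zero number, and an exact == 0 test will then keep it. The sketch below is plain C++; is_near_zero and its default tolerance of 1e-8 are hypothetical choices for illustration, and the right tolerance depends on your data.

```cpp
#include <cmath>

// Exact comparison, the same test as x[i] == 0 in the code above.
bool is_exact_zero(double v) {
    return v == 0.0;
}

// Hypothetical tolerance-based check; the default tolerance is illustrative only.
bool is_near_zero(double v, double tol = 1e-8) {
    return std::fabs(v) < tol;
}
```

For example, 0.1 + 0.2 - 0.3 evaluates to a value on the order of 1e-17 in double precision, so is_exact_zero rejects it while is_near_zero accepts it. If your zeros are the result of computation rather than literal zeros, a tolerance check like this may be what you want.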
Upvotes: 5