Reputation: 517
I am trying to rewrite an R function (Fourier smoothing) in Rcpp for faster computation. My Rcpp function is not returning the desired values.
I have a vector
x = c(6262, 5862.5, 5463, 5408, 5353, 5687, 5901, 6245, 5864, 5483, 5692, 5708.5, 5054.75, 5072.375, 5090, 5462, 4939, 5248.5, 5558, 5226, 5125, 5006, 4887, 5334.5, 5782, 5501, 5524.5, 5548)
My Rcpp function
cppFunction("
NumericVector smo(NumericVector x){
int n = x.size();
NumericVector realpart1(5);
NumericVector imagpart1(5);
NumericVector sm1(n);
for (int i = 0; i<5; i++){
double realpart = 0;
double imagpart = 0;
for (int j = 0; j<n; j++) {
realpart = realpart + 0.07142857*x[j]*cos(2 * 3.142857 * (i+1-1) * (j+2)/28);
imagpart = imagpart + 0.07142857 * x[j] * sin(2 * 3.142857 * (i+1 - 1) * (j+2) /28);
}
realpart1[i]=realpart;
imagpart1[i] = imagpart;
}
for (int j = 0; j<n; j++){
double sm = realpart1[0]/2;
for (int i=0; i<5; i++){
sm = sm + realpart1[i]*cos(2 * 3.142857 * (i+1 - 1) * (j+2) / 28) + imagpart1[i]*sin(2 * 3.142857 * (i+1-1) * (j+2) / 28);
}
sm1[j] = sm;
}
return sm1;
}
")
Output of the function smo is coming like below
16804.81 16674.97 16518.58 16425.55 16453.36 16594.95 16780.77 16914.47
16922.49 16789.76 16563.30 16324.47 16147.96 16070.53 16083.19 16145.65
16210.29 16241.81 16226.19 16170.64 16099.70 16049.52 16058.45 16152.36
16328.20 16545.64 16736.58 16833.36
If I subtract value 10949.12
from the output of function(smo)
I am getting the desired result like below
Desired output
5855.689 5725.846 5569.459 5476.428 5504.237 5645.833 5831.647 5965.351
5973.369 5840.640 5614.181 5375.346 5198.844 5121.412 5134.069 5196.534
5261.174 5292.694 5277.066 5221.517 5150.584 5100.398 5109.330 5203.243
5379.080 5596.524 5787.462 5884.235
The value 10949.12
is the first value of NumericVector realpart1
I am not able to resolve this issue as I am trying Rcpp for the first time. I have checked loops various times, up to the calculation of realpart1
and imagpart1
loop is working fine... There is some problem with the second loop but I am not able to figure out why the value 10949.12
is being added in the output.
I will really appreciate any help in this regard.
equivalent R Code
har = 4
pi = 22/7
realpart1 = c()
imagpart1 = c()
for (p in 1:(har+1)){
realpart = 0
imagpart = 0
for (i in 1:length(x)){
realpart = realpart + (2 /length(x)) * x[i] * cos(2 * pi * (p - 1) * (i+1) / length(x))
imagpart = imagpart + (2 / length(x)) * x[i] * sin(2 * pi * (p - 1) * (i+1) / length(x))
}
realpart1 = c(realpart1,realpart)
imagpart1 = c(imagpart1,imagpart)
#print(realpart)
#print(imagpart)
}
sm1 = c()
for (i in 1:length(x)){
sm = realpart1[1]/2
for (p in 2:(har+1)){
sm = sm + realpart1[p]*cos(2 * pi * (p - 1) * (i+1) / length(x))+ imagpart1[p]*sin(2 * pi * (p - 1) * (i+1) / length(x))
}
sm1 = c(sm1,sm)
}
Upvotes: 0
Views: 247
Reputation: 26843
There is a difference in the limits of the nested for
loop in the second for
loop. In R it goes from 2 to 5, while in C++ it goes from 0 to 4. It should go from 1 to 4 in C++ to be comparable with R.
However, you can probably make the R code faster by avoiding dynamically growing vectors inside the loop. In a for
loop that is almost never necessary, since you know the size of the resulting vector beforehand and can use e.g. realpart <- numeric(length = har + 1)
and realpart[p] <- ...
.
However, in this case one can go even further and formulate the problem in terms of matrices and linear algebra, avoiding the (explicit) loops altogether:
x <- c(6262, 5862.5, 5463, 5408, 5353, 5687, 5901, 6245, 5864, 5483, 5692, 5708.5,
5054.75, 5072.375, 5090, 5462, 4939, 5248.5, 5558, 5226, 5125, 5006, 4887,
5334.5, 5782, 5501, 5524.5, 5548)
fourier_smooth <- function(x, har) {
pi <- 22 / 7 # this should be removed!
phase <- 2 * pi * outer(seq_len(har + 1) - 1, seq_along(x) + 1) / length(x)
real <- 2 / length(x) * cos(phase) %*% x
imag <- 2 / length(x) * sin(phase) %*% x
y <- t(cos(phase)) %*% real + t(sin(phase)) %*% imag
as.numeric(y - real[1]/2)
}
fourier_smooth(x, 4)
#> [1] 5855.695 5725.852 5569.463 5476.432 5504.240 5645.837 5831.651
#> [8] 5965.355 5973.373 5840.644 5614.185 5375.350 5198.848 5121.417
#> [15] 5134.073 5196.538 5261.177 5292.697 5277.070 5221.522 5150.588
#> [22] 5100.402 5109.334 5203.247 5379.084 5596.528 5787.468 5884.242
Created on 2019-08-13 by the reprex package (v0.3.0)
Note that I am including the redefinition of pi only to reproduce your desired result. For correct results, the real value of pi should be used.
However, it is even faster to use R's build in FFT:
x <- c(6262, 5862.5, 5463, 5408, 5353, 5687, 5901, 6245, 5864, 5483, 5692, 5708.5,
5054.75, 5072.375, 5090, 5462, 4939, 5248.5, 5558, 5226, 5125, 5006, 4887,
5334.5, 5782, 5501, 5524.5, 5548)
fourier_smooth <- function(x, har) {
phase <- 2 * pi * outer(seq_len(har + 1) - 1, seq_along(x) - 1) / length(x)
real <- 2 / length(x) * cos(phase) %*% x
imag <- 2 / length(x) * sin(phase) %*% x
y <- t(cos(phase)) %*% real + t(sin(phase)) %*% imag
as.numeric(y - real[1]/2)
}
fourier_smooth2 <- function(x, har) {
y <- fft(x, inverse = TRUE) / length(x)
y[(har+2):(length(x)-har)] <- 0 # filter higher harmonics while keeping the symmetry for real input
Re(fft(y)) # result is already real
}
bench::mark(fourier_smooth(x, 4), fourier_smooth2(x, 4))[1:5]
#> # A tibble: 2 x 5
#> expression min median `itr/sec` mem_alloc
#> <bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt>
#> 1 fourier_smooth(x, 4) 31.66µs 34.97µs 26342. 4.13MB
#> 2 fourier_smooth2(x, 4) 4.82µs 5.49µs 152845. 3.98KB
Created on 2019-08-13 by the reprex package (v0.3.0)
Upvotes: 3