Reputation: 2530
I want to use a custom function over a moving window on a matrix. I have been using terra::focal
. This works well but R-based custom functions are quite slow and this represents a real bottleneck in my analysis. I am hoping to implement the custom function for terra::focalCpp
which uses Rcpp
instead, as this will be considerably faster.
Goal: I want to summarise the cells in the moving window, and check whether that summary value exceeds a pre-set threshold. If it exceeds the threshold, I want to return the original centre value of the window unchanged. If it does not exceed the threshold, then I want to return the minimum value present in the window unchanged.
I have written some code that works as expected for a single vector of numbers:
# Function which works ok on a single vector, assumes 27 values in vector.
cppFunction('double mean_or_min_cpp(NumericVector x, int threshold) {
int n = x.size();
NumericVector temp(n);
double mean2;
for(int i = 0; i < n; ++i) {
temp[i] = pow(10, (x[i]/10));
}
mean2 = 10 * log10(mean(temp));
if (mean2 > threshold) {
return x[13]; // Return the middle value in the vector
} else {
return min(x);
}
}')
However, I'm completely new to C++, and I'm not sure how I can integrate this function with the code needed for functions in terra::focalCpp
. The function compiles ok, but when I try to use it I get Error in .External(list(name = "CppMethod__invoke_notvoid", address = <pointer: (nil)>, : NULL value passed as symbol address
That error seems to arise for a range of reasons and I don't have the knowledge to troubleshoot.
Any suggestions for where I'm going wrong would be really appreciated.
# Attempt at incorporating the above function into the example code provided with the terra::focalCpp help file.
# x, ni and nw are set internally by terra::focalCpp
cppFunction(
'NumericVector ALE2_cpp(NumericVector x, int threshold, size_t ni, size_t nw) {
// Create objects
NumericVector temp(nw); // The subset vector for moving window
NumericVector temp2(nw); // Duplicate of the window
double winsummary; // The value summarised from the window
NumericVector out(ni); // The output vector
size_t start = 0; // Start counter for loop
// Loop over all cells in raster
for (size_t i=0; i<ni; i++) {
start = i;
size_t end = start + nw;
temp = x(start, end); // Subset the main vector to just the window
// Loop over values of the window
for (size_t j=0; j<nw; j++) {
temp2[j] = pow(10, (temp[j]/10)); // Save modified values to new temporary vector
}
// Summarise the modified values to a single value
winsummary = 10 * log10(mean(temp2));
// Determine which value to return
if (winsummary > threshold) {
out[i] = x[i]; // either the original cell
} else {
out[i] = min(temp); // or the minimum in the window
}
}
return out;
}'
)
Example data
library(terra)
library(Rcpp)
# Create example data
set.seed(42)
temp.matrix <- matrix(data = sample(1:30, 600000, replace = TRUE), nrow = 200)
# Convert matrix to a spatRaster
temp.rast <- terra::rast(temp.matrix)
# Moving window
temp.rast.smoothed <- terra::focalCpp(temp.rast, w = c(9,3), fun = trial_cpp, threshold = 10, fillvalue = 0)
Upvotes: 2
Views: 382
Reputation: 141
I would start with simpler C++ code that you are confident will work to make sure your R code is working. Something like this:
NumericVector focal_mean_or_min(NumericVector x, int threshold, size_t ni, size_t nw) {
NumericVector out(ni);
for (size_t i = 0; i < ni; i++) {
out[i] = i;
}
return out;
}
If that works, then I would slowly add to it until I get something like the code below. The code below assumes that ni+nw <= x.size(). You should probably add code to fix or prevent the case where ni+nw > x.size().
NumericVector focal_mean_or_min(NumericVector x, int threshold, size_t ni, size_t nw) {
NumericVector out(ni);
NumericVector temp(nw);
size_t n = x.size();
double min_x = min(x); // calculate this value once outside the for loop
for (size_t i = 0; i < ni; i++) {
// Calculate the start/stop indices of the window
size_t start = i;
size_t end = i + nw;
if (end > n) {
end = n; // if this line is called, then the mean calculated in the line "double mean2 = mean(temp)" will be wrong
}
// Iterate over the window
for (size_t j = start; j < end; j++) {
temp[j-start] = pow(10, x[j] / 10);
}
double mean2 = 10 * log10(mean(temp));
if (mean2 > threshold) {
out[i] = x[13];
}
else {
out[i] = min_x;
}
}
return out;
}
You'll notice a few of the changes I made:
Update:
I've added another snippet of code to reflect the goal added to the OP's question.
Note that this code just ignores ni
because we want the output to be the same size as x
. I also adjusted the window so that i
points to the center value of the current window. If the window is at the edges of x
, then the window is truncated. As a result, the length of the window is sometimes shorter than nw
. This is taken into account when calculating the mean.
Finally, I use a running accumulator to calculate the mean instead of creating the vectors temp
and temp2
. This should be faster.
NumericVector ALE2_cpp(NumericVector x, int threshold, size_t ni, size_t nw) {
// Calculate some values outside the "for" loop so they are done once
size_t n = x.size();
NumericVector out(n); // Set the output vector to the same length as the input vector "x"
double max_double = 1e300 * 1e300; // should overflow to infinity
// Loop over all cells in raster
for (size_t i = 0; i < n; i++) {
// "i" is the index of the current cell
// "start" is the index of the start of the current window
// "end" is the index of the end of the current window
int start = int(i) - (int(nw) / 2); // This sets "i" to be the center of the window if "nw" is odd
int end = start + int(nw);
// Truncate the window instead of padding the input
if (start < 0) {
start = 0;
}
if (end > n) {
end = int(n);
}
// Loop over values of the window finding the minimum and the mean
double accumulator = 0;
double curr_window_min = max_double;
for (size_t j = start; j < end; j++) {
// Search for the minimum value in the current window
if (x[j] < curr_window_min) {
curr_window_min = x[j];
}
// Keep a running total to calculate the mean later
accumulator += pow(10, (x[j] / 10)); // convert from decibels
}
// Calculate the mean from the accumulator value and the current window length
double mean_db = 10 * log10( accumulator / (end - start) );
// Determine which value to return
if (mean_db > threshold) {
out[i] = x[i]; // either the original cell
}
else {
out[i] = curr_window_min; // or the minimum of the current window
}
}
return out;
}
Upvotes: 2