EcologyTom
EcologyTom

Reputation: 2530

Implement custom Cpp function for use in terra::focalcpp in R

I want to use a custom function over a moving window on a matrix. I have been using terra::focal. This works well but R-based custom functions are quite slow and this represents a real bottleneck in my analysis. I am hoping to implement the custom function for terra::focalCpp which uses Rcpp instead, as this will be considerably faster.

Goal: I want to summarise the cells in the moving window, and check whether that summary value exceeds a pre-set threshold. If it exceeds the threshold, I want to return the original centre value of the window unchanged. If it does not exceed the threshold, then I want to return the minimum value present in the window unchanged.

I have written some code that works as expected for a single vector of numbers:

# Function which works ok on a single vector, assumes 27 values in vector.

cppFunction('double mean_or_min_cpp(NumericVector x, int threshold) {
   int n = x.size();
   NumericVector temp(n);
   double mean2;
   
   for(int i = 0; i < n; ++i) {
    temp[i] = pow(10, (x[i]/10));
   }
   
   mean2 = 10 * log10(mean(temp));
   
   if (mean2 > threshold) {
    return x[13];               // Return the middle value in the vector
  } else {
    return min(x);
  }
  
}')

However, I'm completely new to C++, and I'm not sure how I can integrate this function with the code needed for functions in terra::focalCpp. The function compiles ok, but when I try to use it I get Error in .External(list(name = "CppMethod__invoke_notvoid", address = <pointer: (nil)>, : NULL value passed as symbol address That error seems to arise for a range of reasons and I don't have the knowledge to troubleshoot.

Any suggestions for where I'm going wrong would be really appreciated.

# Attempt at incorporating the above function into the example code provided with the terra::focalCpp help file.
# x, ni and nw are set internally by terra::focalCpp

cppFunction( 
  'NumericVector ALE2_cpp(NumericVector x, int threshold, size_t ni, size_t nw) {
      
      // Create objects
      NumericVector temp(nw); // The subset vector for moving window
      NumericVector temp2(nw); // Duplicate of the window
      double winsummary;      // The value summarised from the window
      NumericVector out(ni);  // The output vector
      size_t start = 0;       // Start counter for loop

        // Loop over all cells in raster
        
        for (size_t i=0; i<ni; i++) {
            start = i;
            size_t end = start + nw;
            temp = x(start, end);  // Subset the main vector to just the window

        // Loop over values of the window
            for (size_t j=0; j<nw; j++) {
                temp2[j] = pow(10, (temp[j]/10)); // Save modified values to new temporary vector
            }
            
            // Summarise the modified values to a single value
            winsummary = 10 * log10(mean(temp2));
            
            // Determine which value to return
            if (winsummary > threshold) {
             out[i] = x[i];                // either the original cell
              } else {
             out[i] = min(temp);           // or the minimum in the window
              }

        }
        return out;
    }'
)


Example data

library(terra)
library(Rcpp)

# Create example data
set.seed(42)
temp.matrix <- matrix(data = sample(1:30, 600000, replace = TRUE), nrow = 200)

# Convert matrix to a spatRaster
temp.rast <- terra::rast(temp.matrix)

# Moving window
temp.rast.smoothed <- terra::focalCpp(temp.rast, w = c(9,3), fun = trial_cpp, threshold = 10, fillvalue = 0)

Upvotes: 2

Views: 382

Answers (1)

Joe Mac
Joe Mac

Reputation: 141

I would start with simpler C++ code that you are confident will work to make sure your R code is working. Something like this:

NumericVector focal_mean_or_min(NumericVector x, int threshold, size_t ni, size_t nw) {
    NumericVector out(ni);

    for (size_t i = 0; i < ni; i++) {
        out[i] = i;
    }

    return out;
}

If that works, then I would slowly add to it until I get something like the code below. The code below assumes that ni+nw <= x.size(). You should probably add code to fix or prevent the case where ni+nw > x.size().

NumericVector focal_mean_or_min(NumericVector x, int threshold, size_t ni, size_t nw) {
    NumericVector out(ni);

    NumericVector temp(nw);

    size_t n = x.size();

    double min_x = min(x); // calculate this value once outside the for loop

    for (size_t i = 0; i < ni; i++) {

        // Calculate the start/stop indices of the window
        size_t start = i;
        size_t end = i + nw;
        if (end > n) {
            end = n;  // if this line is called, then the mean calculated in the line "double mean2 = mean(temp)"  will be wrong
        }

        // Iterate over the window
        for (size_t j = start; j < end; j++) {
            temp[j-start] = pow(10, x[j] / 10);
        }

        double mean2 = 10 * log10(mean(temp));

        if (mean2 > threshold) {
            out[i] = x[13];
        }
        else {
            out[i] = min_x;
        }
    }

    return out;
}

You'll notice a few of the changes I made:

  1. I changed the length of "temp" from ni to nw.
  2. I moved the "min()" calculation to outside the "for" loop.

Update:

I've added another snippet of code to reflect the goal added to the OP's question.

Note that this code just ignores ni because we want the output to be the same size as x. I also adjusted the window so that i points to the center value of the current window. If the window is at the edges of x, then the window is truncated. As a result, the length of the window is sometimes shorter than nw. This is taken into account when calculating the mean. Finally, I use a running accumulator to calculate the mean instead of creating the vectors temp and temp2. This should be faster.

NumericVector ALE2_cpp(NumericVector x, int threshold, size_t ni, size_t nw) {

    // Calculate some values outside the "for" loop so they are done once
    size_t n = x.size();
    NumericVector out(n);  // Set the output vector to the same length as the input vector "x"
    double max_double = 1e300 * 1e300; // should overflow to infinity

    // Loop over all cells in raster
    for (size_t i = 0; i < n; i++) {
        // "i" is the index of the current cell
        // "start" is the index of the start of the current window
        // "end" is the index of the end of the current window
        int start = int(i) - (int(nw) / 2); // This sets "i" to be the center of the window if "nw" is odd
        int end = start + int(nw);

        // Truncate the window instead of padding the input
        if (start < 0) {
            start = 0;
        }
        if (end > n) {
            end = int(n);
        }

        // Loop over values of the window finding the minimum and the mean
        double accumulator = 0;
        double curr_window_min = max_double;
        for (size_t j = start; j < end; j++) {

            // Search for the minimum value in the current window
            if (x[j] < curr_window_min) {
                curr_window_min = x[j];
            }

            // Keep a running total to calculate the mean later
            accumulator += pow(10, (x[j] / 10)); // convert from decibels
        }

        // Calculate the mean from the accumulator value and the current window length
        double mean_db = 10 * log10( accumulator / (end - start) );

        // Determine which value to return
        if (mean_db > threshold) {
            out[i] = x[i];           // either the original cell
        }
        else {
            out[i] = curr_window_min; // or the minimum of the current window
        }

    }
    return out;
}

Upvotes: 2

Related Questions