上官宇霖

Reputation: 1

parallelizing pixel-wise regression in R

I'm not familiar with R. I want to speed up a pixel-wise regression over two large datasets (about 4 GB each), but I get this error: Error in clusterR(gim_mod, calc, args = list(fun = coeff)) : cluster error.

Can anyone tell me what is wrong with my code? Here is the code that produces the error:

gim_mod <- stack(gimms_dis_re,modis_re)
coeff <- function(x){
  if (is.na(x[1])){
    NA
  }
  else {
    lm(x[1:156] ~ x[157:312])$coefficients
  }
}
beginCluster(n = 5)
coef_gm <- clusterR(gim_mod,calc, args = list(fun = coeff))
endCluster()

gimms_dis_re and modis_re are two RasterStacks, each containing 156 RasterLayers, and I want to run a pixel-wise regression between them.

Upvotes: 0

Views: 96

Answers (1)

Robert Hijmans

Reputation: 47146

The function passed to calc must return the same number of values for every cell. Your function returns a single NA when the first value is NA, but two values (the intercept and the slope) otherwise.

The code below works for me (with minimal example data).

Example data

library(raster)
r <- raster(nrow=10, ncol=10)
set.seed(321)
s1 <- lapply(1:12, function(i) setValues(r, rnorm(ncell(r), i, 3)))
s2 <- lapply(1:12, function(i) setValues(r, rnorm(ncell(r), i, 3)))
s1 <- stack(s1)
s2 <- stack(s2)
s1[1:5] = NA

Regression of values in one RasterStack with another

s <- stack(s1, s2)
fun <- function(x) {
    if (is.na(x[1])) {
        c(NA, NA)
    } else {
        lm(x[1:12] ~ x[13:24])$coefficients 
    }
}

# works without cluster
x <- calc(s, fun)

# and with cluster
beginCluster(n = 2)
g <- clusterR(s, calc, args = list(fun = fun))
endCluster()
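
The fixed-length rule can also be checked in base R before touching any raster. The sketch below is a variation on the function above, not the code from the question: the name safe_coef is hypothetical, and it assumes the first half of x holds the response and the second half the predictor. Unlike the function above, it also returns c(NA, NA) when fewer than two complete pairs are available, rather than only checking x[1].

```r
# Hypothetical helper (not part of the raster package): always returns
# exactly two values, the intercept and the slope, for one cell.
# Assumes x is the response series followed by the predictor series.
safe_coef <- function(x) {
  n <- length(x) %/% 2               # layers per stack, derived from the input
  y <- x[1:n]                        # response (first stack)
  z <- x[(n + 1):(2 * n)]            # predictor (second stack)
  ok <- !is.na(y) & !is.na(z)        # keep complete pairs only
  if (sum(ok) < 2) {
    return(c(NA_real_, NA_real_))    # same length as a fitted result
  }
  unname(coef(lm(y[ok] ~ z[ok])))    # c(intercept, slope)
}

# Both kinds of cell give a vector of length 2
length(safe_coef(c(rnorm(12), rnorm(12))))
length(safe_coef(rep(NA_real_, 24)))
```

Because the layer count is derived from length(x) rather than from a global variable, nothing extra should need exporting to the workers. It can be passed to calc in the same way as fun above, for example calc(s, safe_coef) or clusterR(s, calc, args = list(fun = safe_coef)).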

Upvotes: 1
