Spatial regression on paired data points

Question

Data and objectives

I am trying to correct for spatial autocorrelation in a regression using paired data. More specifically, for each polygon of a spatial grid (200 m x 200 m squares) I have the number of individuals in two population groups (call them a and b; these could be, for instance, the number of adults and children). I also have a raster for one environmental variable (var). For simplicity, I have converted both datasets to points, using the centroid of each square and assigning to it the corresponding number of individuals and the median of the variable within that square.

I want to test whether the environmental variable differs between the groups. The idea is to use a (probably linear) regression var ~ group, with the number of individuals in each group as a weighting variable.
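To make the weighted comparison concrete, here is a minimal sketch on made-up data (the `toy` data frame and its values are illustrative assumptions, not my real data). With a two-level group, the slope of the weighted `lm` should simply be the difference between the two weighted group means of `var`:

```r
set.seed(1)

# Hypothetical toy data: two groups, a count per row used as weight
toy <- data.frame(
  pop = rep(c("a", "b"), each = 50),
  n   = sample.int(100, 100, replace = TRUE),
  var = rnorm(100)
)

# Weighted regression of the variable on group
fit <- lm(var ~ pop, weights = n, data = toy)

# Weighted mean of var in each group, computed directly
wm <- sapply(split(toy, toy$pop), function(d) weighted.mean(d$var, d$n))

diff_wm  <- unname(wm["b"] - wm["a"])      # difference of weighted means
coef_pop <- unname(coef(fit)["popb"])      # slope from the weighted lm

all.equal(diff_wm, coef_pop)
```

So the quantity I am after is a count-weighted difference in `var` between groups; the open part is getting valid standard errors for it.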

My issue

I haven't been able to write a model that correctly takes into account (1) spatial autocorrelation, (2) weights and (3) repeated (i.e., non-independent) spatial data.

What I've tried (example on toy data)

Toy data

library(sf)
library(nlme)
library(spdep)

# Create coordinates for each point in a grid
df = data.frame(
  id = 1:100,
  x = rep(1:10, times = 10),
  y = rep(1:10, each = 10)
)

# Create population size
df = rbind(cbind(df, data.frame(pop = rep('a', 100), n = sample.int(100, 100))),
           cbind(df, data.frame(pop = rep('b', 100), n = sample.int(100, 100))))
# Create environmental variable with autocorrelation and higher in one population
df$var = df$x*df$y/10 + rnorm(100, 1, 2) + as.numeric(df$pop == 'a')

data = st_as_sf(df, coords = c("x", "y"))
data$x = df$x
data$y = df$y

Simple lm. This does not take into account non-independence or spatial correlation.

mod0 = lm(var~pop, weights = n, data)
summary(mod0)

LME with spatial correlation structure. I am not sure the random effect is needed here, as the distances should already account for this, but in any case I cannot use zero distances. Reference for the weights formula: https://www.r-bloggers.com/2012/12/a-quick-note-in-weighting-with-nlme/

mod1 = lme(var~pop, random = ~ 1|id, data = data, weights = ~1/n, 
          correlation = corExp(form = ~ x + y))
> Error in getCovariate.corSpatial(object, data = data) : 
  cannot have zero distances in "corSpatial"
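As far as I understand, the zero distances come from the paired layout itself: the two groups share the same centroid, so within each `id` the two rows sit at distance 0. A quick base R check, rebuilding only the coordinates of the toy data (a sketch to illustrate the cause, not a fix):

```r
# Rebuild the toy coordinates: two rows (pop a and b) per grid cell
df <- data.frame(
  id = 1:100,
  x  = rep(1:10, times = 10),
  y  = rep(1:10, each = 10)
)
df <- rbind(cbind(df, pop = "a"), cbind(df, pop = "b"))

# Distance between the two rows of a single id group
d_within <- dist(df[df$id == 1, c("x", "y")])
print(d_within)

# Number of rows whose coordinates duplicate an earlier row
n_dup <- sum(duplicated(df[, c("x", "y")]))
print(n_dup)
```

Every cell contributes one duplicated coordinate pair, which is what `corExp(form = ~ x + y)` refuses to handle within a group.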

Spatial regression

# Find nearest neighbours
nb <- spdep::dnearneigh(data, d1 = 0, d2 = 1)
lw <- nb2listw(nb, style = "W", zero.policy = TRUE)

# Model with spatially correlated errors
serr <- spatialreg::errorsarlm(var~pop,
                                   data = data,
                                   listw = lw,
                                   zero.policy = TRUE, 
                                   na.action = na.omit,
                                   weights = n)
summary(serr)
# My issue here: non-independence between the two groups at each point

# Model with spatial lag
slag <- spatialreg::lagsarlm(var~pop,
                                   data = data,
                                   listw = lw,
                                   zero.policy = TRUE, 
                                   na.action = na.omit,
                                   weights = n)
summary(slag)
> Error in spatialreg::lagsarlm(var ~ pop, data = data, listw = lw, zero.policy = TRUE,  : 
  unused argument (weights = n)

# My issues here: non-independence between the two groups at each point +
# cannot use weights in lagsarlm?
# The other option would be to replicate rows (one row per individual), but I suspect this would affect significance tests.
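To illustrate why I am wary of replicating rows, here is a minimal sketch with plain `lm` on made-up data (the `toy` data frame is an illustrative assumption). Expanding each row `n` times gives the same coefficients as `weights = n`, but the residual degrees of freedom are inflated, so the standard errors shrink:

```r
set.seed(1)

# Hypothetical toy data with small integer counts
toy <- data.frame(
  pop = rep(c("a", "b"), each = 20),
  n   = sample.int(10, 40, replace = TRUE),
  var = rnorm(40)
)

# Option 1: counts as regression weights
fit_w <- lm(var ~ pop, weights = n, data = toy)

# Option 2: one row per individual (each row repeated n times)
expanded <- toy[rep(seq_len(nrow(toy)), times = toy$n), ]
fit_rep  <- lm(var ~ pop, data = expanded)

# Same point estimates, different standard errors
coef(fit_w)
coef(fit_rep)
se_w   <- coef(summary(fit_w))["popb", "Std. Error"]
se_rep <- coef(summary(fit_rep))["popb", "Std. Error"]
c(weighted = se_w, replicated = se_rep)
```

I would expect the same inflation of apparent sample size if the replicated rows were passed to `lagsarlm`, on top of creating even more coincident points.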

My question

What would be the correct way to structure this model, using these or other approaches for spatial correlation?
