Nausi

Spatial regression on paired data points

Data and objectives

I am trying to correct for spatial autocorrelation in a regression using paired data. More specifically, for each polygon of a spatial grid (200 m x 200 m squares), I have the number of individuals in two population groups (call them a and b; these could be, for instance, adults and children). I also have a raster for one environmental variable (var). For simplicity, I have converted both datasets to points, using the centroid of each square and assigning to it the corresponding number of individuals and the median of the variable within that square.

I want to test whether the environmental variable differs between the groups. The idea is to use a (probably linear) regression var ~ group, with the number of individuals in each group as a weighting variable.
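
To make the intended weighting concrete, here is a minimal sketch on made-up, non-spatial data (the data frame toy and all its values are hypothetical). It only illustrates that, in a weighted linear regression on a single two-level factor, the group coefficient equals the difference between the n-weighted group means of var; it does not address spatial autocorrelation or pairing.

```r
# Hypothetical toy data: two groups, a count n per row, and a response var
set.seed(42)
toy <- data.frame(
  pop = rep(c("a", "b"), each = 50),
  n   = sample.int(100, 100, replace = TRUE),
  var = rnorm(100)
)

# Weighted regression of var on group, weighting each row by its count
fit <- lm(var ~ pop, data = toy, weights = n)

# Weighted mean of var within each group
wm <- sapply(split(toy, toy$pop), function(d) weighted.mean(d$var, d$n))
diff_wm <- unname(wm["b"] - wm["a"])

# The 'popb' coefficient matches the difference in weighted means
coef_b <- unname(coef(fit)["popb"])
print(all.equal(coef_b, diff_wm))
```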

My issue

I haven't been able to write a model that correctly accounts for (1) spatial autocorrelation, (2) weights and (3) repeated (i.e., non-independent) spatial data.

What I've tried (example on toy data)

Toy data

library(sf)
library(nlme)
library(spdep)

# Create coordinates for each point in a grid
df = data.frame(
  id = 1:100,
  x = rep(1:10, times = 10),
  y = rep(1:10, each = 10)
)

# Create population size
df = rbind(cbind(df, data.frame(pop = rep('a', 100), n = sample.int(100, 100))),
           cbind(df, data.frame(pop = rep('b', 100), n = sample.int(100, 100))))
# Create environmental variable with autocorrelation and higher in one population
df$var = df$x*df$y/10 + rnorm(100, 1, 2) + as.numeric(df$pop == 'a')

data = st_as_sf(df, coords = c("x", "y"))
data$x = df$x
data$y = df$y

Simple lm. This does not take into account non-independence or spatial correlation.

mod0 = lm(var ~ pop, data = data, weights = n)
summary(mod0)

LME with spatial correlation structure. Here I am not sure the random effect is needed, as the distances should take this into account, but in any case I cannot use zero distances. Reference for the weights formula: https://www.r-bloggers.com/2012/12/a-quick-note-in-weighting-with-nlme/

mod1 = lme(var~pop, random = ~ 1|id, data = data, weights = ~1/n, 
          correlation = corExp(form = ~ x + y))
> Error in getCovariate.corSpatial(object, data = data) : 
  cannot have zero distances in "corSpatial"
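
The zero distances presumably come from the paired layout: each centroid appears twice, once per population, and nlme evaluates the correlation structure within each level of the grouping factor, so the two rows sharing an id have identical coordinates. A minimal base-R sketch of that layout (the object names grid and long are illustrative, not part of the model above):

```r
# Rebuild the 10 x 10 grid of centroids and stack it once per population
grid <- expand.grid(x = 1:10, y = 1:10)
grid$id <- seq_len(nrow(grid))
long <- rbind(cbind(grid, pop = "a"), cbind(grid, pop = "b"))

# Pairwise Euclidean distances between all 200 rows
d <- as.matrix(dist(long[, c("x", "y")]))

# Each of the 100 locations contributes one pair of rows at distance zero
n_zero_pairs <- sum(d[upper.tri(d)] == 0)
print(n_zero_pairs)  # 100

# Within each id (the grouping used in the lme call), the only distance is 0
within_id_dist <- sapply(
  split(long[, c("x", "y")], long$id),
  function(xy) max(dist(xy))
)
print(all(within_id_dist == 0))  # TRUE
```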

Spatial regression

# Find distance-based neighbours (points within 1 grid unit of each other)
nb <- spdep::dnearneigh(data, d1 = 0, d2 = 1)
lw <- nb2listw(nb, style = "W", zero.policy = TRUE)

# Model with spatially correlated errors
serr <- spatialreg::errorsarlm(var~pop,
                                   data = data,
                                   listw = lw,
                                   zero.policy = TRUE, 
                                   na.action = na.omit,
                                   weights = n)
summary(serr)
# My issue here: non-independence between the two groups on each point

# Model with spatial lag
slag <- spatialreg::lagsarlm(var~pop,
                                   data = data,
                                   listw = lw,
                                   zero.policy = TRUE, 
                                   na.action = na.omit,
                                   weights = n)
summary(slag)
> Error in spatialreg::lagsarlm(var ~ pop, data = data, listw = lw, zero.policy = TRUE,  : 
  unused argument (weights = n)

# My issues here: non-independence between the two groups on each point +
# cannot use weights in lagsarlm?
# The other option would be to replicate rows (one row for each individual) but I suspect this would affect significance tests.
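
The concern about replicating rows can be checked on a small non-spatial example. The sketch below uses made-up data (toy and expanded are hypothetical names) and plain lm rather than lagsarlm, so it only shows the general mechanism: duplicating each row n times reproduces the weighted coefficients, but the inflated sample size shrinks the standard errors.

```r
# Hypothetical toy data: one row per (location, group) with a count n
set.seed(1)
toy <- data.frame(
  pop = rep(c("a", "b"), each = 20),
  n   = sample.int(10, 40, replace = TRUE),
  var = rnorm(40)
)

# Option 1: use the counts as regression weights
fit_w <- lm(var ~ pop, data = toy, weights = n)

# Option 2: replicate each row n times (one row per individual)
expanded <- toy[rep(seq_len(nrow(toy)), times = toy$n), ]
fit_rep <- lm(var ~ pop, data = expanded)

# Point estimates agree between the two options
print(all.equal(coef(fit_w), coef(fit_rep)))

# Standard errors do not: the replicated fit treats copies as new observations
se_w   <- coef(summary(fit_w))[, "Std. Error"]
se_rep <- coef(summary(fit_rep))[, "Std. Error"]
print(rbind(weighted = se_w, replicated = se_rep))
```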

My question

What would be the correct way to structure this model, using these or other approaches for spatial correlation?
