Data and objectives
I am trying to correct for spatial autocorrelation in a regression using paired data. More specifically, for each polygon of a spatial grid (200 m x 200 m squares) I have the number of individuals in two population groups (call them a and b; these could be, for instance, adults and children). I also have a raster of one environmental variable (var). For simplicity, I have converted both datasets to points, using the centroid of each square and assigning to it the corresponding number of individuals and the median of the variable within that square.
I want to test whether the environmental variable differs between the groups. The idea is to use a (probably linear) regression var ~ group, with the number of individuals in each group as a weighting variable.
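To make the target concrete, here is one way to write the weighted regression I have in mind, before any spatial correction. This is only a sketch in my own notation (the symbols are assumptions, not taken from any package): i indexes grid squares, g is the group (a or b), and n_ig is the number of individuals of group g in square i. It deliberately ignores both the spatial autocorrelation and the pairing of the two groups within a square, which are exactly the parts I am missing.

```latex
\mathrm{var}_{ig} = \beta_0 + \beta_1 \,\mathbb{1}[g = b] + \varepsilon_{ig},
\qquad
\operatorname{Var}(\varepsilon_{ig}) = \frac{\sigma^2}{n_{ig}}
```

The quantity of interest is \beta_1, the difference in var between the two groups.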
My issue
I haven't been able to write a model that correctly accounts for (1) spatial autocorrelation, (2) weights and (3) repeated (i.e., non-independent) spatial data.
What I've tried (example on toy data)
Toy data
library(sf)
library(nlme)
library(spdep)
# Create coordinates for each point in a grid
df = data.frame(
id = 1:100,
x = rep(1:10, times = 10),
y = rep(1:10, each = 10)
)
# Create population size
df = rbind(cbind(df, data.frame(pop = rep('a', 100), n = sample.int(100, 100))),
cbind(df, data.frame(pop = rep('b', 100), n = sample.int(100, 100))))
# Create environmental variable with autocorrelation and higher in one population
df$var = df$x*df$y/10 + rnorm(100, 1, 2) + as.numeric(df$pop == 'a')
data = st_as_sf(df, coords = c("x", "y"))
data$x = df$x
data$y = df$y
Simple lm. This does not take into account non-independence or spatial correlation.
mod0 = lm(var~pop, weights = n, data)
summary(mod0)
LME with a spatial correlation structure. Here I am not sure the random effect is needed, as the distances should take this into account, but in any case I cannot use zero distances. Reference for the weights formula: https://www.r-bloggers.com/2012/12/a-quick-note-in-weighting-with-nlme/
mod1 = lme(var~pop, random = ~ 1|id, data = data, weights = ~1/n,
correlation = corExp(form = ~ x + y))
> Error in getCovariate.corSpatial(object, data = data) :
cannot have zero distances in "corSpatial"
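As far as I can tell, the zero distances come from the paired layout itself: every grid point appears twice, once per group, so the two rows of a pair sit at exactly the same coordinates. A quick check, rebuilding only the coordinates of the toy data (the names coords, n_dup and min_dist are just for illustration):

```r
# Rebuild only the coordinates of the toy data: each grid point appears
# once for group 'a' and once for group 'b'
coords <- data.frame(
  x = rep(rep(1:10, times = 10), times = 2),
  y = rep(rep(1:10, each = 10), times = 2)
)

# Number of rows whose (x, y) already occurred in an earlier row
n_dup <- sum(duplicated(coords))
n_dup
# 100

# Smallest pairwise Euclidean distance between any two rows
min_dist <- min(dist(coords))
min_dist
# 0
```

So the error does not seem to be specific to corExp; any correlation structure built from these coordinates would see 100 coincident pairs.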
Spatial regression
# Find neighbours within a distance of 1 (distance-based, not k-nearest)
nb <- spdep::dnearneigh(data, d1 = 0, d2 = 1)
lw <- nb2listw(nb, style = "W", zero.policy = TRUE)
# Model with spatially correlated errors
serr <- spatialreg::errorsarlm(var~pop,
data = data,
listw = lw,
zero.policy = TRUE,
na.action = na.omit,
weights = n)
summary(serr)
# My issue here: non-independence between the two groups at each point
# Model with spatial lag
slag <- spatialreg::lagsarlm(var~pop,
data = data,
listw = lw,
zero.policy = TRUE,
na.action = na.omit,
weights = n)
summary(slag)
> Error in spatialreg::lagsarlm(var ~ pop, data = data, listw = lw, zero.policy = TRUE, :
unused argument (weights = n)
# My issues here: non-independence between the two groups at each point +
# cannot use weights in lagsarlm?
# The other option would be to replicate rows (one row per individual), but I suspect this would affect the significance tests.
My question
What would be the correct way to structure this model, using these or other approaches for spatial correlation?