user25754519

Reputation: 1

How to successfully create a raster of predicted values from a spatial logistic regression? (R spaMM package)

I am trying to run a spatial logistic regression model using the spaMM package in R. I have a stack of three predictor rasters representing canopy height, canopy layers, and large tree density, and am trying to train a model to predict "complex forest" structure across a landscape (based on known areas with older forests vs. timber harvested areas).

I'm using the spaMM package to account for spatial autocorrelation, and ultimately want an output raster that shows the probability of complex forest presence in each 30 m raster cell. I am able to both fit and predict the model with glm() and it works just fine, but (I believe) I can't account for spatial effects with glm(), which is why I am trying the spaMM package. Here is a tutorial I have been referencing: https://www.r-bloggers.com/2019/09/spatial-regression-in-r-part-1-spamm-vs-glmmtmb/#google_vignette

I am able to fit the model in spaMM with the fitme() function, but I keep running into an indexing error when I try to predict across the entire extent I'm focusing on: "Error: [`[`(j)] the type of index j cannot be a factor or character"

Does anyone know why this error is occurring? I have tried making sure that the values in my terra raster stack of predictors are all numeric, but that does not solve it.

Here is what I hope is a reproducible example with some dummy data, including comments indicating the errors I'm getting:

library(terra)
library(spaMM)

# crs we're using
crs_ref <- "EPSG:3310"

# create empty raster
empty_r <- rast(res=30, nlyr= 3, # number of layers, matching the number of predictor columns
                xmin=-251376,
                xmax=-228036,
                ymin=69372,
                ymax=138628, # can be other object extent like a shapefile, etc
                crs= crs_ref)
nrow(empty_r)
### fill the empty raster with cell values. including as.numeric to ensure values are numeric
values(empty_r$lyr.1) <- paste(as.numeric(sample(30:80, 2309, replace=T)))
values(empty_r$lyr.2) <- paste(as.numeric(sample(1:3, 2309, replace=T)))
values(empty_r$lyr.3) <- paste(as.numeric(sample(20:100, 2309, replace=T)))

### adjust the names
names(empty_r) <- c("height","layers","density")

# create the presence/absence attribute
empty_r$olderTrees <- paste(as.numeric(sample(0:1, 2309, replace=T)))
head(empty_r)

# sample 100 random points to train the model (pretending we don't need testing data)
Random_points <- terra::spatSample(empty_r, size = 100, as.points=T, values=T, method = "random")
plot(Random_points)
Random_points <- as.data.frame(Random_points, geom='XY')
head(Random_points)

# fit model with inclusion of spatial effect
mod <- fitme(olderTrees ~ height+layers+density 
             + Matern(1 | x + y), data = Random_points, family = "binomial")
# I get some warnings, assuming this is due to structure of dummy data. I don't get the
# warning when using the real dataset. 
#"Warning messages:
#1: In .qr.rank.def.warn(r) :
#  matrix is structurally rank deficient; using augmented matrix with additional 7 row(s) of zeros
#2: spaMM_glm.fit: fitted probabilities numerically 0 or 1 occurred 

summary(mod)

predictTest <- predict(mod, newdata=empty_r)
# Error: [`[`(j)] the type of index j cannot be a factor or character

# try adding x and y attributes to the raster stack in case that's the issue
empty_r$x <- crds(empty_r)[,1]
empty_r$y <- crds(empty_r)[,2]

predictTest2 <- predict(mod, allow.new.levels=T, newdata=empty_r)
# same error as above about index j

# and trying to turn the raster into a df
empty_r_df <- as.data.frame(empty_r)

predictTest3 <- predict(mod, newdata=empty_r_df)
# Error in model.frame.default(Terms, data, xlev = .get_from_terms_info(object = fitobject,  : 
# factor SaloCC_CanopyHeightClip has new levels 35, 42, 59, 60, 72, 75, 78
# add allow.new.levels=T??

predictTest3 <- predict(mod, newdata=empty_r_df, allow.new.levels=T)
# Same error about new levels

Upvotes: 0

Views: 112

Answers (1)

sebdunnett

Reputation: 31

Remove the paste() call when generating the values, i.e. change values(empty_r$lyr.1) <- paste(as.numeric(sample(30:80, 2309, replace=T))) to values(empty_r$lyr.1) <- as.numeric(sample(30:80, 2309, replace=T)), and do the same for the other layers. With that change, the third prediction (the one using a data frame) runs for me. paste() converts the numbers to character strings, so the model treats the predictors as categorical variables, which would explain the "new levels" error.
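For reference, here is a minimal sketch of the data-frame route from start to finish, ending with the predictions written back into a raster. It assumes a small hypothetical 40 x 40 grid so it runs quickly. The object names (r, grid_df, train, prob) and the layer name p_complex are illustrative and not from the question. It is a sketch under those assumptions, not a tuned workflow for the real data.

```r
library(terra)
library(spaMM)

set.seed(1)

# Hypothetical small grid (40 x 40 cells of 30 m); the real extent is much larger
r <- rast(nrows = 40, ncols = 40, nlyrs = 4,
          xmin = 0, xmax = 1200, ymin = 0, ymax = 1200, crs = "EPSG:3310")
n <- ncell(r)

# Fill with plain numeric values: no paste(), so nothing becomes character/factor
values(r) <- cbind(sample(30:80,  n, replace = TRUE),
                   sample(1:3,    n, replace = TRUE),
                   sample(20:100, n, replace = TRUE),
                   sample(0:1,    n, replace = TRUE))
names(r) <- c("height", "layers", "density", "olderTrees")

# One row per cell, in cell order, with x/y coordinates for the Matern term
grid_df <- as.data.frame(r, xy = TRUE, na.rm = FALSE)

# 100 random cells as training data (dummy data, so warnings are possible)
train <- grid_df[sample(nrow(grid_df), 100), ]

mod <- fitme(olderTrees ~ height + layers + density + Matern(1 | x + y),
             data = train, family = binomial())

# Predict for every cell from the data frame
pred <- predict(mod, newdata = grid_df)

# Copy the geometry of r into a single-layer raster and fill it with predictions
prob <- rast(r, nlyrs = 1)
values(prob) <- as.numeric(pred)
names(prob) <- "p_complex"
```

Because as.data.frame() is called with na.rm = FALSE, the rows stay in cell order, so the predictions can be assigned straight back with values(). On a large raster, predicting every cell at once can be memory-hungry, so it may be worth predicting in chunks of rows.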

Upvotes: 0
