Performing spatial regressions with huge datasets without crashing the computer

I'm running a spatial regression (SAR, spatial autoregressive model) in R using the spmodel package. My dataset is an sf object of centroids.

This dataset has 100,781 observations, and I created a neighbour list with 3 neighbours for each observation.

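For context, the neighbour list was built roughly like this (a sketch using spdep's k-nearest-neighbour helpers; the real code may differ slightly):

library( "sf" )
library( "spdep" )

# Sketch: build a 3-nearest-neighbour weights list from the centroid geometries.
# 'subset' is the sf object of centroids used in the model call below.
coords <- st_coordinates( subset )        # point coordinates as a matrix
knn    <- knearneigh( coords, k = 3 )     # 3 nearest neighbours per observation
nb     <- knn2nb( knn )                   # convert to a neighbours (nb) object
NbList <- nb2listw( nb, style = "W" )     # row-standardised spatial weights list
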
With this information, I run the following code:

library( "spmodel" )

Model_splm <- splm( formula       =  variables, 
                    data          =  subset, 
                    listw         =  NbList, 
                    model         = "pooling",
                    lag           =  TRUE,
                    spatial.error = "b",
                    parallel      =  TRUE,
                    local         =  list( parallel = TRUE )
                    ) 

As you can see, I'm using parallel processing to speed up this regression (my computer has 8 cores and 16 logical processors). However, I'm not able to run this code because my computer crashes. I can only run this regression when the number of observations is around 40,000.

Does anyone have a suggestion for how to run this regression on my computer? Or any other suggestions? This is just one example; I need to run multiple spatial regressions, and I've been struggling with this.

Upvotes: 4

Views: 142

Answers (1)

user3666197

Reputation: 1

Q1 :
" Or any other suggestions? "

Well, the spmodel authors' paper is brutally open about the computational complexity being rather prohibitive for naive use on un-curated data, explicitly warning about the exponential, cubic and quadratic (in sample size) time and space costs that limit how far this package can be pushed:

The computational cost associated with model fitting is exponential in the sample size for all estimation methods. For maximum likelihood and restricted maximum likelihood, the computational cost of estimating θ is cubic. For semivariogram weighted least squares and semivariogram composite likelihood, the computational cost of estimating θ is quadratic. The computational cost associated with estimating β and prediction is cubic in the model-fitting sample size, regardless of estimation method. Typically, sample sizes approaching 10,000 make the computational cost of model fitting and prediction infeasible, which necessitates the use of big data methods.

spmodel offers big data methods for model fitting of point-referenced data via the local argument to splm(). The method is capable of quickly fitting models with hundreds of thousands to millions of observations. Because of the neighborhood structure of areal data, the big data methods used for point-referenced data do not apply to areal data. Thus, there is no big data method for areal data or local argument to spautor(), so model fitting sample sizes cannot be too large.

spmodel offers big data methods for prediction of point-referenced data or areal data via the local argument to predict(), capable of quickly predicting hundreds of thousands to millions of observations.

Rather clear & sound, isn't it?
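In practice, for point-referenced data such as the asker's centroids, that local argument is passed straight to splm(). A minimal sketch, assuming spmodel's splm() interface; the formula, data object, covariance type and core count below are illustrative placeholders, not taken from the question:

library( "spmodel" )

# Sketch of a big data fit using spmodel's local approximation.
# 'y ~ x1 + x2' and 'centroids_sf' are hypothetical placeholders.
fit_big <- splm( formula    = y ~ x1 + x2,
                 data       = centroids_sf,       # sf object of point centroids
                 spcov_type = "exponential",      # a spatial covariance for point data
                 estmethod  = "reml",
                 local      = list( parallel = TRUE,   # fit local groups in parallel
                                    ncores   = 8 )
                 )
summary( fit_big )

# Prediction has an analogous big data option via the local argument:
# preds <- predict( fit_big, newdata = new_points_sf, local = TRUE )

Further options for local (how observations are split into groups, group sizes, etc.) are documented in ?splm.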


Q2 :
" Does anyone have a suggestion of how to be able to perform that regression in my computer? "

Besides a few tricks already offered in the paper to reduce those costs by pre-clustering and the like, you can try to benefit from modern hypervisor-coordinated, ultra-scaled computing fabrics, such as the reverse-hypervisor computing infrastructure invented by Dr. Isaac R. Nassi. In spite of all the technology in the background, it still appears to your operating system and your R code as a single, immensely large PC with an almost infinite amount of RAM and an almost infinite number of CPU cores, all coordinated by the reverse hypervisor (the patented technology was recently acquired by a genuinely big-data vendor, handling far more than a few hundred thousand points, so it should be available for academic or similar efforts to run a proof of concept on their premises).

Doing the same on an ordinary computer seems a priori not possible, as the authors warned.

Upvotes: 0
