simplify shapefile for placing random points with spsample()

Question

I need to place random points on each shapefile in a list of shapefiles with spsample(). For some irregular shapefiles this takes a very long time, so I need to simplify them by dropping small and remote polygons, which (I think) are the trouble-makers for spsample().

For this I need to know, for each polygon, its size and its mean distance to all other polygons. I am looking for a way to speed up this calculation; it can probably be done in a more elegant (and faster) way. The attempt shown below works, but as a simplifying algorithm it takes too much time.

# The program tries to place random points on shapes[[i]]; if it fails after 300 seconds, it goes through to the simplifying part and swaps the old shapefile for a simplified version.

d <- shapes[[i]]
Fdist <- list()

for(m in 1:dim(d)[1]) {
  pDist <- vector()
  for(n in 1:dim(d)[1]) {
    pDist <- append(pDist, gDistance(d[m,],d[n,]))
  }

  Fdist[[m]] <- pDist
  d@data$mean[m] <- mean(Fdist[[m]])
  d@data$gArea[m] <- gArea(d[m,])
}

#drop small and remote polygons

d.1<-d[d@data$gArea>=quantile(d@data$gArea, prob = seq(0, 1, length=11), type=5)[[1]] & (d@data$mean<=quantile(d@data$mean, prob = seq(0, 1, length=11), type=5)[[10]]),]

#replace with simplified polygon

shapes[[i]]<-d.1

I would be grateful for any suggestion.

Phil · Accepted Answer

I would try simplifying the polygons first. ms_simplify() in the rmapshaper package can greatly simplify your polygons without introducing sliver polygons or gaps:

library("rgdal")
library("rmapshaper")

big <- readOGR(dsn = ".", "unsimplified_shapefile")
big_sample <- spsample(big, 1000, type = "stratified")

small <- rmapshaper::ms_simplify(big, keep = 0.01)
small_sample <- spsample(small, 1000, type = "stratified")

With a shapefile I had to hand, I reduced a ~100MB shapefile to ~2MB and reduced the time taken to sample from ~2.3s to ~0.11s.

If simplifying is not an option, you can vectorise your gArea() and gDistance() calls by using byid = TRUE, which computes the result for every polygon in one call instead of looping:

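library("rgeos")
# Area of each polygon (one value per feature)
big@data$area <- gArea(big, byid = TRUE)
# gDistance(byid = TRUE) returns an n x n matrix of pairwise distances;
# the row means give each polygon's mean distance, as in the question's loop
big@data$dist <- rowMeans(gDistance(big, byid = TRUE))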
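
To tie this back to the filtering step in the question, here is a minimal sketch on a toy layer. Everything about the data is made up for illustration: the make_square() helper, the 3 x 3 grid of unit squares and the tiny "remote" polygon are hypothetical, and the 10% / 90% quantile cut-offs are only an assumption about what "small" and "remote" should mean for your data. The pairwise distance matrix from gDistance(byid = TRUE) is reduced to one mean distance per polygon with rowMeans():

```r
library("sp")
library("rgeos")

# Hypothetical helper: build one square polygon (clockwise ring, not a hole)
make_square <- function(x0, y0, side, id) {
  coords <- cbind(c(x0, x0, x0 + side, x0 + side, x0),
                  c(y0, y0 + side, y0 + side, y0, y0))
  Polygons(list(Polygon(coords, hole = FALSE)), ID = id)
}

# Toy data: nine unit squares on a grid plus one tiny, far-away polygon
grid <- expand.grid(x = c(0, 2, 4), y = c(0, 2, 4))
ids <- c(paste0("p", seq_len(nrow(grid))), "remote")
squares <- Map(make_square, grid$x, grid$y, 1, ids[1:9])
squares <- c(squares, list(make_square(50, 50, 0.1, "remote")))
d <- SpatialPolygonsDataFrame(
  SpatialPolygons(squares),
  data.frame(id = ids, row.names = ids, stringsAsFactors = FALSE)
)

# Vectorised replacements for the nested loop in the question
d@data$gArea <- gArea(d, byid = TRUE)
d@data$mean <- rowMeans(gDistance(d, byid = TRUE))

# Assumed cut-offs: drop the smallest 10% and the most remote 10%
small_cut <- quantile(d@data$gArea, probs = 0.1, type = 5)
far_cut <- quantile(d@data$mean, probs = 0.9, type = 5)
keep <- d@data$gArea >= small_cut & d@data$mean <= far_cut
d.1 <- d[keep, ]
```

On this toy layer the tiny remote polygon is the only one dropped. For a real layer you would replace the toy construction with shapes[[i]] and tune the two probs values to taste.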
