Mikko
Mikko

Reputation: 7755

Quickest way to crop/clip a complex SpatialPolygonsDataFrame using another polygon

What is the quickest way of cropping (clipping) a complex SpatialPolygonsDataFrame with the @data preserved in R using another, possibly complex, SpatialPolygon? I know of two ways (shown under). The raster way is quicker for less complex SpatialPolygonsDataFrames and returns a SpatialPolygonsDataFrame as in the example. It gets slow with large SpatialPolygonsDataFrames, though. The rgeos way is quicker for large SpatialPolygonsDataFrame, but it sometimes fails with very complex geometries and does not return SpatialPolygonsDataFrames as default.

I have not paid attention to the GIS development in R lately and there might well be quicker ways of doing this by now.

The example polygons are small to honor people's computers and bandwidth. Consider the "real" polygons 50-1000 MB.

Setup:

library(rnaturalearth)
library(sp)
library(raster)
library(rgeos)

dt <- rnaturalearth::ne_countries()
clip_boundary <- sp::SpatialPolygons(list(sp::Polygons(
list(sp::Polygon(data.frame(lon = c(0, 180, 180, 0), lat = c(40, 40, 80, 80)))), ID = 1)))

The rgeos way:

system.time({
clipped.dt <- rgeos::gIntersection(dt, clip_boundary, byid = TRUE)
ids <- sapply(slot(clipped.dt, "polygons"), function(x) slot(x, "ID"))
ids <- sapply(strsplit(ids, " "), "[", 1) 

tmp.df <- data.frame(dt@data[ids,])
names(tmp.df) <- names(dt@data)

out <- sp::SpatialPolygonsDataFrame(clipped.dt, tmp.df, match.ID = FALSE)
})

#   user  system elapsed 
#  0.069   0.002   0.074 

class(out)
# [1] "SpatialPolygonsDataFrame"
# attr(,"package")
# [1] "sp"

The raster way:

system.time({
  out <- raster::crop(dt, clip_boundary)
})

#   user  system elapsed 
#  0.042   0.001   0.043 

class(out)
# [1] "SpatialPolygonsDataFrame"
# attr(,"package")
# [1] "sp"

A plot for a reference (not relevant for the question):

plot(out)

enter image description here

Upvotes: 3

Views: 1771

Answers (1)

mrhellmann
mrhellmann

Reputation: 5489

Much of R's geospatial work can now be done using the sf package.

https://r-spatial.github.io/sf/index.html

It looks like it might be a touch faster than raster and sp.

library(rnaturalearth)
library(sp)
#library(raster)
#library(rgeos)
library(sf)
#> Linking to GEOS 3.7.2, GDAL 2.4.2, PROJ 5.2.0
library(ggplot2)

# Original data 
dt <- rnaturalearth::ne_countries()
clip_boundary <- sp::SpatialPolygons(list(sp::Polygons(
  list(sp::Polygon(data.frame(lon = c(0, 180, 180, 0), lat = c(40, 40, 80, 80)))), ID = 1)))

# Original data as sf objects
dt_sf <- st_as_sf(dt)
boundary_sf <- st_as_sf(clip_boundary)

# clip 
clipped <- st_crop(dt_sf, boundary_sf)
#> although coordinates are longitude/latitude, st_intersection assumes that they are planar
#> Warning: attribute variables are assumed to be spatially constant throughout all
#> geometries

#plot
ggplot(clipped) + geom_sf()


system.time({
  out <- st_crop(dt_sf, boundary_sf)
})
#> although coordinates are longitude/latitude, st_intersection assumes that they are planar
#> Warning: attribute variables are assumed to be spatially constant throughout all
#> geometries
#>    user  system elapsed 
#>   0.019   0.000   0.018

system.time({
  out <- raster::crop(dt, clip_boundary)
})
#> Warning in raster::crop(dt, clip_boundary): non identical CRS
#>    user  system elapsed 
#>   0.526   0.000   0.525

system.time({
  clipped.dt <- rgeos::gIntersection(dt, clip_boundary, byid = TRUE)
  ids <- sapply(slot(clipped.dt, "polygons"), function(x) slot(x, "ID"))
  ids <- sapply(strsplit(ids, " "), "[", 1) 

  tmp.df <- data.frame(dt@data[ids,])
  names(tmp.df) <- names(dt@data)

  out <- sp::SpatialPolygonsDataFrame(clipped.dt, tmp.df, match.ID = FALSE)
})
#> Warning in RGEOSBinTopoFunc(spgeom1, spgeom2, byid, id, drop_lower_td,
#> unaryUnion_if_byid_false, : spgeom1 and spgeom2 have different proj4 strings
#>    user  system elapsed 
#>   0.045   0.000   0.045

Created on 2020-04-13 by the reprex package (v0.3.0)

Upvotes: 3

Related Questions