Reputation: 7755
What is the quickest way of cropping (clipping) a complex SpatialPolygonsDataFrame with the @data
preserved in R using another, possibly complex, SpatialPolygon? I know of two ways (shown under). The raster
way is quicker for less complex SpatialPolygonsDataFrames and returns a SpatialPolygonsDataFrame as in the example. It gets slow with large SpatialPolygonsDataFrames, though. The rgeos
way is quicker for large SpatialPolygonsDataFrame, but it sometimes fails with very complex geometries and does not return SpatialPolygonsDataFrames as default.
I have not paid attention to the GIS development in R lately and there might well be quicker ways of doing this by now.
The example polygons are small to honor people's computers and bandwidth. Consider the "real" polygons 50-1000 MB.
Setup:
library(rnaturalearth)
library(sp)
library(raster)
library(rgeos)
dt <- rnaturalearth::ne_countries()
clip_boundary <- sp::SpatialPolygons(list(sp::Polygons(
list(sp::Polygon(data.frame(lon = c(0, 180, 180, 0), lat = c(40, 40, 80, 80)))), ID = 1)))
The rgeos
way:
system.time({
clipped.dt <- rgeos::gIntersection(dt, clip_boundary, byid = TRUE)
ids <- sapply(slot(clipped.dt, "polygons"), function(x) slot(x, "ID"))
ids <- sapply(strsplit(ids, " "), "[", 1)
tmp.df <- data.frame(dt@data[ids,])
names(tmp.df) <- names(dt@data)
out <- sp::SpatialPolygonsDataFrame(clipped.dt, tmp.df, match.ID = FALSE)
})
# user system elapsed
# 0.069 0.002 0.074
class(out)
# [1] "SpatialPolygonsDataFrame"
# attr(,"package")
# [1] "sp"
The raster
way:
system.time({
out <- raster::crop(dt, clip_boundary)
})
# user system elapsed
# 0.042 0.001 0.043
class(out)
# [1] "SpatialPolygonsDataFrame"
# attr(,"package")
# [1] "sp"
A plot for a reference (not relevant for the question):
plot(out)
Upvotes: 3
Views: 1771
Reputation: 5489
Much of R's geospatial work can now be done using the sf
package.
https://r-spatial.github.io/sf/index.html
It looks like it might be a touch faster than raster and sp.
library(rnaturalearth)
library(sp)
#library(raster)
#library(rgeos)
library(sf)
#> Linking to GEOS 3.7.2, GDAL 2.4.2, PROJ 5.2.0
library(ggplot2)
# Original data
dt <- rnaturalearth::ne_countries()
clip_boundary <- sp::SpatialPolygons(list(sp::Polygons(
list(sp::Polygon(data.frame(lon = c(0, 180, 180, 0), lat = c(40, 40, 80, 80)))), ID = 1)))
# Original data as sf objects
dt_sf <- st_as_sf(dt)
boundary_sf <- st_as_sf(clip_boundary)
# clip
clipped <- st_crop(dt_sf, boundary_sf)
#> although coordinates are longitude/latitude, st_intersection assumes that they are planar
#> Warning: attribute variables are assumed to be spatially constant throughout all
#> geometries
#plot
ggplot(clipped) + geom_sf()
system.time({
out <- st_crop(dt_sf, boundary_sf)
})
#> although coordinates are longitude/latitude, st_intersection assumes that they are planar
#> Warning: attribute variables are assumed to be spatially constant throughout all
#> geometries
#> user system elapsed
#> 0.019 0.000 0.018
system.time({
out <- raster::crop(dt, clip_boundary)
})
#> Warning in raster::crop(dt, clip_boundary): non identical CRS
#> user system elapsed
#> 0.526 0.000 0.525
system.time({
clipped.dt <- rgeos::gIntersection(dt, clip_boundary, byid = TRUE)
ids <- sapply(slot(clipped.dt, "polygons"), function(x) slot(x, "ID"))
ids <- sapply(strsplit(ids, " "), "[", 1)
tmp.df <- data.frame(dt@data[ids,])
names(tmp.df) <- names(dt@data)
out <- sp::SpatialPolygonsDataFrame(clipped.dt, tmp.df, match.ID = FALSE)
})
#> Warning in RGEOSBinTopoFunc(spgeom1, spgeom2, byid, id, drop_lower_td,
#> unaryUnion_if_byid_false, : spgeom1 and spgeom2 have different proj4 strings
#> user system elapsed
#> 0.045 0.000 0.045
Created on 2020-04-13 by the reprex package (v0.3.0)
Upvotes: 3