Reputation: 21
I am trying to create a spatial survey design by first putting a grid over a study site and then sampling randomly within each grid cells. Here is a made-up example that creates a grid of hexagons over the state of Montana. The problem is that st_sample
provides random points over the entire shape, not within the grid. I tried the other types
within st_sample
and they do not work. I suspect that st_sample
doesn't see my grid as multiple polygons?
Any ideas, either using these packages/functions or others in R?
States <- raster::getData("GADM", country = "United States", level = 1)
Montana <- States[States$NAME_1 == "Montana",]
MT <- sf::st_as_sf(Montana)
grid <- MT %>%
st_make_grid(cellsize = 1, what = "polygons", square = FALSE) %>%
st_intersection(MT)
set.seed(51)
pts_sf <- st_sample(grid, size = 78, type = 'random')
ggplot() +
geom_sf(data = MT) +
geom_sf(data = grid) +
geom_sf(data = pts_sf)
Upvotes: 2
Views: 560
Reputation: 1413
?st_sample
revealed there is a by_polygon = TRUE
parameter for MULTIPOLYGON geometries which sounded very promising but this can't be that easy, right?
This is how I modified your code, basically a st_combine
call followed by st_as_sf
on your grid in order to obtain a simple feature collection of type MULTIPOLYGON:
States <- raster::getData("GADM", country = "United States", level = 1)
Montana <- States[States$NAME_1 == "Montana",]
MT <- sf::st_as_sfc(Montana)
grid <- MT |>
sf::st_make_grid(cellsize = 1, what = "polygon", square = FALSE) |>
sf::st_intersection(MT) |>
sf::st_as_sf()
grid_mp <- sf::st_combine(grid) |> sf::st_as_sf()
set.seed(51)
pts_sf <- sf::st_sample(grid_mp, size = dim(grid)[1], type = "random", by_polygon = TRUE)
... leading to the following error message:
Error in s2_geography_from_wkb(x, oriented = oriented, check = check) :
Evaluation error: Found 1 feature with invalid spherical geometry.
[1] Loop 44 edge 5 has duplicate near loop 47 edge 3.
So I wrote a little function in order to understand what is happening here (don't hate me for this inelegant approach):
st_sample_per_feature <- function(data) {
n <- dim(data)[1]
list <- list()
for (i in 1:n) {
list[[i]] <- sf::st_sample(data[[1]][i], size = 1, type = "random") |> sf::st_as_sf()
}
dplyr::bind_rows(list)
}
pts_full <- st_sample_per_feature(grid)
pts_full
#> Simple feature collection with 78 features and 0 fields
#> Geometry type: POINT
#> Dimension: XY
#> Bounding box: xmin: -116.0416 ymin: 44.55168 xmax: -104.0458 ymax: 48.99927
#> CRS: +proj=longlat +datum=WGS84
#> First 10 features:
#> x
#> 1 POINT (-115.6625 47.71146)
#> 2 POINT (-116.0416 48.99488)
#> 3 POINT (-115.3728 47.32979)
#> 4 POINT (-115.8625 48.48905)
#> 5 POINT (-114.5906 47.82194)
The output looks quite promising on the whole but the plot shows clear patterns along the border:
ggplot2::ggplot() +
ggplot2::geom_sf(data = MT) +
ggplot2::geom_sf(data = grid, color = "red", fill = ggplot2::alpha("grey", 0.2)) +
ggplot2::geom_sf(data = pts_full)
That's when I started to count the grid cells manually. 62 of them are visible but grid
consists of 78 objects. st_area(grid) |> sort(decreasing = TRUE)
shows there are exact 16 polygon objects in grid
having a significantly smaller area than the rest.
Unfortunately, this is not a solution to your problem at this point but I strongly assume st_sample(x, by_polygon = TRUE)
fails in the first place because of these artifacts resulting from the way you call st_make_grid()
.
So basically it seems like the problem here is not the sampling-per-polygon but the definition of a suitable and geometrically correct (?) grid.
Upvotes: 1