Reputation: 61
I hope this is not too trivial but I really can't find an answer and I'm too new to the topic to come up with alternatives myself. So here is the Problem:
I have two shapefiles x and y that represent different processing levels of a Sentinel2 satellite image.
x contains about 1.300.000 polygons/Segments completely covering the image extend without any further vital information.
y contains about 500 polygons representing the cloud-free area of the image (also covering most of the image except for a few "cloud-holes") as well as information about the used image in 4 columns (Sensor, Time...)
I'm trying to add the image information to x in places x is covered by y. pretty simple? I just can't find a way to make it happen without taking days.
I read x in as a simple feature {sf}, as reading it with shapefile / readOGR takes ages. I tried different things with y
when I try merge(x,y) I can only take one sf as merge doesn't support two sf's. merging x (as sf) and y (as shp) gives me the error "cannot allocate vector of size 13.0 Gb"
so I tried sf::st_join(x,y)
, which supports both Variables to be sf but still didn't finish for 28 hours now
sf::st_intersect(x,y)
took about 9 minutes for a 10.000 segment subset, so that might not be a lot faster for the whole piece.
could subsetting x to a few smaller pieces solve the whole thing or is there another simple solution? could I do something with my workspace to make the merge work or is there simply no shortcut to joining that amount of polygons?
Thanks a lot in advance and I hope my description isn't too fuzzy!
my tiny work station:
win 7 64 bit
8 GB RAM
intel i7-4790 @ 3,6 GHz
Upvotes: 6
Views: 1239
Reputation: 1360
I often face this kind of problems and as @manotheshark2 afirms, I prefer to work in a loop subseting my vector layer. Here is my advice:
Load your data
library(raster)
library(rgdal)
x <- readOGR('C:/', 'sentinelCovers')
y <- readOGR('C:/', 'cloudHoles')
Assign an y ID for identify which x polygons intersects y polygons and create the column in x table
x$xyID <- NA # Answer col
y$yID <- 1:nrow(y@data) # ID col
Run a loop subseting x
for (posX in 1:nrow(x@data)){
pol.x <- x[posX, ]
intX <- raster::intersect(pol.x, y)
# x$xyID[posX] <- intX@data$yID ## Run this if there's unique y polygons
# x$xyID[posX] <- paste0(intX@data$yID, collapse = ',') ## Run this if there's multiple y polygons
}
You can check if is better to run the loop on x o y layer
x$xyID <- NA # Answer col
x$xID <- 1:nrow(x@data) # ID Col
for (posY in 1:nrow(y@data)){
pol.y <- y[posY, ]
intY <- tryCatch(raster::intersect(pol.y, x), finally = 'NULL')
if (is.null(intY)) next
x$xyID[x@data$xID %in% intY@data$xID] <- pol.y$yID
}
Upvotes: 2