yrx1702
yrx1702

Reputation: 1651

Streamlining binary rasterization in R

I have a few very small country-level polygon and point shapefiles that I would like to rasterize in R. The final product should be one global binary raster (indicating whether grid cell center is covered by a polygon / point lies within cell or not). My approach is to loop over the shapefiles and do the following for each shapefile:

  # load shapefile  
  shp       = sf::read_sf(shapefile_path)
  
  # create a global raster template with resolution 0.0083
  ext       = extent(-180.0042, 180.0042, -65.00417, 75.00417)
  gridsize  = 0.008333333
  r         = raster(ext, res = gridsize)
  
  # rasterize polygon or point shapefile to raster
  rr        = rasterize(shp, r, background = 0) #all grid cells that are not covered get 0

  # convert to binary raster
  values(rr)[values(rr)>0] = 1

Here, rr is the raster file where the polygons / points in shp are coded as 1 and all other grid cells are coded as 0. Afterwards, I take the sum over all rr to arrive at one global binary raster file including all polygons / points.

The final two steps are incredibly slow. In addition, I get RAM problems when I try to replace the all positive values in rr with 1 as the cell count is very large due to the fine resolution. I was wondering whether it is possible to come up with a smarter solution for what I'd like to achieve.

I have already found the fasterize package that has a speedy implementation of rasterize which works fine. I think it would be of great help if someone has a solution where rasterize directly returns a binary raster.

Upvotes: 0

Views: 989

Answers (2)

Robert Hijmans
Robert Hijmans

Reputation: 47591

This is how you can do this better with raster. Note the value=1 argument, and also that that I changed your specification of the extent -- as what you do is probably not correct.

library(raster)
v <- shapefile(shapefile_path)
ext <- extent(-180, 180, -65, 75)
r <- raster(ext, res = 1/120)
rr <- rasterize(v, r, value=1, background = 0)

There is no need for your last step, but you could have done

rr <- clamp(rr, 0, 1)
# or 
rr <- rr > 0
# or
rr <- reclassify(rr, cbind(1, Inf, 1))

raster::calc is not very efficient for simple arithmetic like this

It should be much faster to rasterize all vector data in one step, rather than in a loop, especially with large rasters like this (for which the program may need to write a temp file for each iteration).

To illustrate this solution with example data

library(raster)
cds1 <- rbind(c(-180,-20), c(-140,55), c(10, 0), c(-140,-60))
cds2 <- rbind(c(-10,0), c(140,60), c(160,0), c(140,-55))
cds3 <- rbind(c(-125,0), c(0,60), c(40,5), c(15,-45))
v <- spLines(cds1, cds2, cds3)   
r <- raster(ncols=90, nrows=45)

r <- rasterize(v, r, field=1)

To speed things up, you can use terra (the replacement for raster)

library(raster)
f <- system.file("ex/lux.shp", package="terra")
v <- as.lines(vect(f))
r <- rast(v, ncol=75, nrow=100)
x <- rasterize(v, r, field=1)  

Upvotes: 1

yrx1702
yrx1702

Reputation: 1651

Something that seems to work computationally and significantly improves computation time is to

  1. Create one large shapefile shp instead of working with individual rasterized shapefiles.

  2. Use the fasterize package to rasterize the merged shapefile.

  3. Use raster::calc to avoid memory problems.

 ext       = extent(-180.0042, 180.0042, -65.00417, 75.00417)
 gridsize  = 0.008333333
 r         = raster(ext, res=gridsize)
 rr        = fasterize(shp, r, background = 0) #all not covered cells get 0, others get sum

# convert to binary raster
  fun       =  function(x) {x[x>0] <- 1; return(x) }
  r2        =  raster::calc(rr, fun)

Upvotes: 0

Related Questions