Reputation: 11
I have a raster in R, I need to select the highest cell values up until 30% of the raster area is selected.
The way that I've tried to accomplish this is by calculating the average cell area, and then calculating how many cells I need to meet this 30% target (I know this is not entirely accurate). Then I sort the raster values, descending. Here is where I'm stuck. Of these sorted values, I need to set all cells beyond #12,678 to NA. I can't figure out how to set values to NA based on their place in an order. Does anyone know how to do this? Or have a better idea for the entire process?
Upvotes: 1
Views: 719
Reputation: 47061
If you assume a constant cell size, then you might as well ignore it and you can do something like this (find the quantile of interest and use that as a threshold):
library(terra)
r <- rast(system.file("ex/elev.tif", package = "terra"))
q <- global(r, \(i) quantile(i, 0.7, na.rm=T))
x <- ifel(r <= q[[1]], NA, r)
plot(x)
Check
g <- global(c(x,r), "notNA")
g[1,1]/g[2,1]
# 0.2979601
If variation in cell size is important, you could do something like the below (sort the values, and find the cells below the cumulative cell size threshold), but note that this approach is not memory-safe (would fail with very large rasters)
s <- c(r, cellSize(r, unit="km"))
d <- as.data.frame(s, cell=T, na.rm=T)
d <- d[order(d$elevation, decreasing=TRUE), ]
d$sumarea <- cumsum(d$area) / sum(d$area)
cells <- d[d$sumarea <= 0.3, "cell"]
msk <- rast(r)
msk[cells] <- 1
x <- mask(r, msk)
plot(x)
Upvotes: 0
Reputation: 3402
you can use tidyterra
for this.
Once that you have identified your thresold (12,678) use tidyterra::filter()
on your raster.
I don't know if you are using raster
or terra
, I use here terra
and I show you how to convert back to Raster*
object.
On this example I present a full workflow with an example raster file, just adapt it to your data.
library(terra)
#> terra 1.5.21
library(tidyterra)
#> -- Attaching packages --------------------------------------- tidyterra 0.1.0 --
#>
#> Suppress this startup message by setting Sys.setenv(tidyterra.quiet = TRUE)
#> v tibble 3.1.7 v dplyr 1.0.9
#> v tidyr 1.2.0
# Create a SpatRaster from a file
f <- system.file("ex/elev.tif", package = "terra")
r <- rast(f)
# If you are using raster package use this line for switch it to terra:
# r <- rast(your_data)
r
#> class : SpatRaster
#> dimensions : 90, 95, 1 (nrow, ncol, nlyr)
#> resolution : 0.008333333, 0.008333333 (x, y)
#> extent : 5.741667, 6.533333, 49.44167, 50.19167 (xmin, xmax, ymin, ymax)
#> coord. ref. : lon/lat WGS 84 (EPSG:4326)
#> source : elev.tif
#> name : elevation
#> min value : 141
#> max value : 547
totarea_km <- expanse(r, unit = "km")
prettyNum(totarea_km, big.mark = ",")
#> [1] "2,563.61"
plot(r)
# There is a bug on slice_max, use filter
# Get thresold for filtering by working on the values of the raster
# as it is a data frame
min_value <- as_tibble(r, na.rm = TRUE) %>%
slice_max(order_by = elevation, prop = .3) %>%
min()
# Here it goes!!
top30perc <- r %>% filter(elevation > min_value)
# Check
area_30perc <- expanse(top30perc, unit = "km")
prettyNum(area_30perc, big.mark = ",")
#> [1] "761.0991"
# Check
area_30perc / totarea_km
#> [1] 0.2968857
# Seems ok
plot(top30perc)
# If you need to convert to Raster* object
raster::stack(top30perc)
#> class : RasterStack
#> dimensions : 90, 95, 8550, 1 (nrow, ncol, ncell, nlayers)
#> resolution : 0.008333333, 0.008333333 (x, y)
#> extent : 5.741667, 6.533333, 49.44167, 50.19167 (xmin, xmax, ymin, ymax)
#> crs : +proj=longlat +datum=WGS84 +no_defs
#> names : elevation
#> min values : 387
#> max values : 547
Created on 2022-05-31 by the reprex package (v2.0.1)
Upvotes: 0