Reputation: 31452
I have some large raster bricks of gridded global data. I want to perform some computationally demanding calculations on a cell by cell basis. To reduce computation load I want to run the model only on raster cells that contain data in at least one of the layers of the brick. But how do I efficiently find cells that contain some data? I know I can do it with a loop. like this:
1st some reproducible data:
library(raster)
r.list = vector("list", 20)
set.seed(123)
for (i in 1:20) {
r.list[[i]] = raster(vals = sample(c(rep(NA,10),1), 100, TRUE), nrows = 10, ncols = 10, ext = extent(c(0,25,0,25)))
}
r.brick = brick(r.list)
Now we loop through each layer of the brick to find which cells have some data:
has.data.list = vector("list", 20)
for (i in 1:20) {
has.data.list[[i]] = which(!is.na(values(raster(r.brick, layer=i))))
}
has.data = sort(unique(unlist(has.data.list)))
But this is rather inelegant. Is there a canonical / efficient way to get a vector of cells that contain some data?
Upvotes: 1
Views: 580
Reputation: 5932
A possible trick could be to exploit the fact that max()
will only return NA
when all its input values are NA
if you specify na.rm = TRUE.
Therefore, you could compute the max of the raster brick, and check where it is different than NA. Something like:
has.data = which(!is.na(getValues(max(r.brick, na.rm=TRUE))))
has.data
1 2 4 5 6 7 8 9 10 11 12 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 33 34 35 37 38 39 40 41 42 44 46 47 48 49 50 51 52 53 54 55 56 57 59 60 61 62 64 66 67 68 69 70 71 74 75 76 77 78 79 80 82 83 85 86 87 89 90 91 92 93 95 96 97 98 99 100
Upvotes: 1