Reputation: 3805

fill NA raster cells using focal defined by boundary

I have a raster and a shapefile. The raster contains NA and I am filling the NAs using the focal function

library(terra)

v <- vect(system.file("ex/lux.shp", package="terra"))
r <- rast(system.file("ex/elev.tif", package="terra"))
r[45:60, 45:60] <- NA
  
r_fill <- terra::focal(r, 5, mean, na.policy="only", na.rm=TRUE)

However, there are some NA still left. So I do this:

na_count <- terra::freq(r_fill, value = NA)

while(na_count$count != 0){
  
 r_fill <- terra::focal(r_fill, 5, mean, na.policy="only", na.rm=TRUE)  
 na_count <- terra::freq(r_fill, value = NA)
}

Once all NA's are filled, I clip the raster again using the shapefile

r_fill  <- terra::crop(r_fill, v, mask = T, touches = T)

This is what my before and after looks like:

I wondered if the while loop is an efficient way to fill the NAs or basically determine how many times I have to run focal to fill all the NAs in the raster.

Upvotes: 2

Answers (1)

Chris

Reputation: 2286

Perhaps we can, or want to, dispense with the while( altogether by making a better estimate of focal('s w= arg in a world where r, as ground truth, isn't available. Were it available, we could readily derive direct value of w

r <- rast(system.file("ex/elev.tif", package="terra"))
# and it's variants
r2 <- r
r2[45:60, 45:60] <- NA
freq(r2, value=NA) - freq(r, value=NA)
  layer value count
1     0    NA   256
sqrt((freq(r2, value=NA) - freq(r, value=NA))$count)
[1] 16

which might be a good value for w=, and introducing another variant

r3 <- r
r3[40:47, 40:47] <- NA
r3[60:67, 60:67] <- NA
r3[30:37, 30:37] <- NA
r3[70:77, 40:47] <- NA
rm(r)

We no longer have our ground truth. How might we estimate an edge of w=? Turning to boundaries( default values (inner)

r2_bi <- boundaries(r2)
r3_bi <- boundaries(r3)
# examining some properties of r2_bi, r3_bi
freq(r2_bi, value=1)$count
[1] 503
freq(r3_bi, value=1)$count
[1] 579
freq(r2_bi, value=1)$count/freq(r2_bi, value = 0)$count
[1] 0.1306833
freq(r3_bi, value=1)$count/freq(r3_bi, value = 0)$count
[1] 0.1534588
sum(freq(r2_bi, value=1)$count,freq(r2_bi, value = 0)$count)
[1] 4352
sum(freq(r3_bi, value=1)$count,freq(r3_bi, value = 0)$count)
[1] 4352

Taken in reverse order, sum[s] and freq[s] suggest that while the total area of (let's call them holes) are the same, they differ in number and r2 is generally larger than r3. This is also clear from the first pair of freq[s]. Now we drift into some voodoo, hocus pocus in pursuit of a better edge estimate

    sum(freq(r2)$count) - sum(freq(r2, value = NA)$count)
    [1] 154
    sum(freq(r3)$count) - sum(freq(r3, value = NA)$count)
    [1] 154
    (sum(freq(r3)$count) - sum(freq(r3, value = NA)$count))
    [1] 12.40967
    freq(r2_bi, value=1)$count/freq(r2_bi, value = 0)$count
    [1] 0.1306833
    freq(r2_bi, value=0)$count/freq(r2_bi, value = 1)$count
    [1] 7.652087
    freq(r3_bi, value=1)$count/freq(r3_bi, value = 0)$count
    [1] 0.1534588
    taking the larger, i.e. freq(r2_bi 7.052087
    7.652087/0.1306833
    [1] 58.55444
    154+58
    [1] 212
    sqrt(212)
    [1] 14.56022
    round(sqrt(212)+1)
    [1] 16

Well, except for that +1 part, maybe still a decent estimate for w=, to be used on both r2 and r3 if called upon to find a better w, and perhaps obviate the need for while(. Another approach to looking for squares and their edges:

wtf3 <- values(r3_bi$elevation)
wtf2 <- values(r2_bi$elevation)
wtf2_tbl_df2 <-   as.data.frame(table(rle(as.vector(is.na(wtf2)))$lengths))
wtf3_tbl_df2 <- as.data.frame(table(rle(as.vector(is.na(wtf3)))$lengths))
names(wtf2_tbl_df2)
[1] "Var1" "Freq"
wtf2_tbl_df2[which(wtf2_tbl_df2$Var1 == wtf2_tbl_df2$Freq), ]
   Var1 Freq
14   16   16
wtf3_tbl_df2[which(wtf3_tbl_df2$Freq == max(wtf3_tbl_df2$Freq)), ]
  Var1 Freq
7    8   35
35/8
[1] 4.375 # 4 squares of 8 with 3 8 length vectors

bringing in v finally and filling

v <- vect(system.file("ex/lux.shp", package="terra"))
r2_fill_17 <- focal(r2, 16 + 1 , mean, na.policy='only', na.rm = TRUE)
r3_fill_9 <- focal(r3, 8 + 1 , mean, na.policy='only', na.rm = TRUE)
r2_fill_17_cropv <- crop(r2_fill_17, v, mask = TRUE, touches = TRUE)
r3_fill_9_cropv <- crop(r3_fill_9, v, mask = TRUE, touches = TRUE)

And I now appreciate your while( approach as your r2 looks better, more naturally transitioned, though the r3 looks fine. In my few, brief experiments with smaller than 'hole', i.e. focal(r2, 9, I got the sense it would take 2 passes to fill, that suggests focal(r2, 5 would take 4.

I guess further determining the proportion of fill:hole:rast for when to deploy a while would be worthwhile.

Upvotes: 1

fill NA raster cells using focal defined by boundary

Answers (1)

Related Questions