happymappy
happymappy

Reputation: 181

Working nested for-loops into R code to reduce redundancy

EDIT: The rasters were acquired from the National Weather Service (https://water.weather.gov/precip/download.php). My county shapefiles are for Kentucky and could be acquired from Esri's ArcGIS online data services.

I have 5 rasters that contain precipitation data for various periods of records: daily, previous 30 days, 60 days, 90 days, and 365 days. I want to perform zonal statistics to calculate the mean value of each raster that falls within a shapefile that represents a county boundary polygon. I have 120 county polygons.

I have code that works, but I'd like to refine it. Currently, it involves a lot of copying and pasting of the same code.

library(rgdal)
library(raster)
library(sf)
library(maptools)

# County shapefiles
input_path <- "C:/path/to/shapefiles/GIS_data/Counties"
files <- list.files(input_path, pattern="[.]shp$", full.names=TRUE)
allShapes <- lapply(files, readOGR)
filenames <- list.files(input_path, pattern="[.]shp$")

# Rasters
NWS1_daily <- raster(paste(daily_downloads, daily_fname, sep='/'), band = 1)
NWS1_prev30 <- raster(paste(bulk_downloads, prev30_fname, sep='/'), band = 1)
NWS1_prev90 <- raster(paste(bulk_downloads, prev90_fname, sep='/'), band = 1)
NWS1_prev120 <- raster(paste(bulk_downloads, prev120_fname, sep='/'), band = 1)
NWS1_prev365 <- raster(paste(bulk_downloads, prev365_fname, sep='/'), band = 1)

# Perform zonal statistics to acquire mean observation for each county
observations <- vector()
filenames <- vector()
for (i in 1:length(allShapes)){
  observations[i] <- c(extract(NWS1_daily, allShapes[[i]], fun=mean, na.rm=TRUE, weights=T, normalizeWeights=F))
  filenames<-files[i]
}
filenames <- list.files(input_path, pattern="[.]shp$") 
observations<-data.frame(observations,filenames,stringsAsFactors=FALSE)

observations <- vector()
filenames <- vector()
for (i in 1:length(allShapes)){
  observations[i] <- c(extract(NWS1_prev30, allShapes[[i]], fun=mean, na.rm=TRUE, weights=T, normalizeWeights=F))
  filenames<-files[i]
}
filenames <- list.files(input_path, pattern="[.]shp$") 
observations<-data.frame(observations,filenames,stringsAsFactors=FALSE)

observations <- vector()
filenames <- vector()
for (i in 1:length(allShapes)){
  observations[i] <- c(extract(NWS1_prev90, allShapes[[i]], fun=mean, na.rm=TRUE, weights=T, normalizeWeights=F))
  filenames<-files[i]
}
filenames <- list.files(input_path, pattern="[.]shp$") 
observations<-data.frame(observations,filenames,stringsAsFactors=FALSE)

observations <- vector()
filenames <- vector()
for (i in 1:length(allShapes)){
  observations[i] <- c(extract(NWS1_prev120, allShapes[[i]], fun=mean, na.rm=TRUE, weights=T, normalizeWeights=F))
  filenames<-files[i]
}
filenames <- list.files(input_path, pattern="[.]shp$") 
observations<-data.frame(observations,filenames,stringsAsFactors=FALSE)

observations <- vector()
filenames <- vector()
for (i in 1:length(allShapes)){
  observations[i] <- c(extract(NWS1_prev365, allShapes[[i]], fun=mean, na.rm=TRUE, weights=T, normalizeWeights=F))
  filenames<-files[i]
}
filenames <- list.files(input_path, pattern="[.]shp$") 
observations<-data.frame(observations,filenames,stringsAsFactors=FALSE)


How would I go about placing a nested for loop into my existing code to tackle some of the redundancy?

I've been trying to do this using some online resources as a reference, but haven't been able to get it to work.

raster_names_band1 <- c(NWS1_daily, NWS1_prev30, NWS1_prev90, NWS1_prev120, NWS1_prev365)

# Perform zonal statistics on each county to acquire mean observation for each countt
observations <- vector()
raster_names_band1 <- vector()
filenames <- vector()
for (i in 1:length(allShapes))
  {
for (j in 1:length(raster_names_band1))
  observations[i] <- c(extract(raster_names_band1[[j]], allShapes[[i]], fun=mean, na.rm=TRUE, weights=T, normalizeWeights=F))
  filenames<-files[i]
}
filenames <- list.files(input_path, pattern="[.]shp$") 
observations<-data.frame(observations,filenames,stringsAsFactors=FALSE)

## Error in raster_names_band1[[j]] : subscript out of bounds

## Error in data.frame(observations, filenames, stringsAsFactors = FALSE) : 
## arguments imply differing number of rows: 0, 120

I appreciate any advice you may have!

Upvotes: 2

Views: 112

Answers (1)

Parfait
Parfait

Reputation: 107567

Because RasterLayers (the return of raster()) are special S4 class objects, binding them together with c() as in c(NWS1_daily, NWS1_prev30, NWS1_prev90, NWS1_prev120, NWS1_prev365) may not produce a list of five objects but possibly a simplified array of various dimensions. Inspect this object with str(raster_names_band1). For complex objects, use list() to bind together in a collection.

Additionally, instead of nested for loops requiring bookkeeping of initializing objects and assigning to them, consider a nested apply family method. Specifically, run lapply to iterate across raster objects and nested sapply to iterate through allShapes to build observations for in data.frame call. Since filenames are the same for all, assign it once as a global object. Even wrap call with setNames to render a named list.

# LIST OF RASTER OBJECTS (USE list() NOT c())
raster_list <- list(NWS1_daily, NWS1_prev30, NWS1_prev90, NWS1_prev120, NWS1_prev365)
filenames <- list.files(input_path, pattern="[.]shp$"))

# USER-DEFINED FUNCTION
df_build <- function(obj_raster) {
   observations <- sapply(allShapes, function(s) 
                               c(extract(obj_raster, s, fun=mean, na.rm=TRUE, 
                                         weights=TRUE, normalizeWeights=FALSE))
                          )
   data.frame(observations, filenames, stringsAsFactors=FALSE)
} 

# BUILD NAMED LIST OF DFs
df_list <- setNames(lapply(raster_list, df_build),
                    c("NWS1_daily", "NWS1_prev30", "NWS1_prev90", 
                      "NWS1_prev120", "NWS1_prev365")
                   )

# DF OUTPUTS
df_list$NWS1_daily
df_list$NWS1_prev30
df_list$NWS1_prev90
...

In fact, you can run nested sapply (wrapper to lapply) to avoid the list of raster objects and setNames but use a character vector of raster object names and get() to reference object by string:

raster_names <- c("NWS1_daily", "NWS1_prev30", "NWS1_prev90",
                  "NWS1_prev120", "NWS1_prev365")
filenames <- list.files(input_path, pattern="[.]shp$"))

df_build <- function(raster_nm) {
   # keep all same but change: extract(obj_raster, s, ... --> extract(get(raster_nm), s, ...
}

# BUILD NAMED LIST OF DFs
df_list <- sapply(raster_names, df_build, simplify = FALSE)

Note: above is untested code and may require various adjustments depending on global objects.

Upvotes: 2

Related Questions