Rachel
Rachel

Reputation: 1

extracting cell numbers from multiple counties in R

I'm new to R so please excuse any terminology mistakes... I'm trying to extract the cell numbers for every county in the state of Oklahoma and paste them on top of each other so that I can use them to look at different temperatures throughout Oklahoma state. I have a shapefile of counties in the US, so I made a vector of all the county ID numbers for the state of OK. I then tried to extract the cell numbers and max temp values for every county in a loop. That extract line that I wrote works when I do it one county at a time, I think it's the okcounty=rbind line that's the problem but I don't know what the best way to do this is.

Thank you for your help! I really appreciate it.

`okcounties=which(counties$STATE_NAME=="Oklahoma") #contains 58 counties
 county = NULL
 for (i in 1:58){
   countyvalues=extract(OK.tmax[[1]], extent(counties[okcounties[i],]), cellnumbers=T)
   county=rbind(county, countyvalues) #add data from each of 58 counties 
 }`

Upvotes: 0

Views: 90

Answers (1)

Jeffrey Evans
Jeffrey Evans

Reputation: 2417

I am finding your code a bit confusing and can see a few places it is going wrong. You are overthinking things a bit. I am not sure why you are extracting cellnumbers and not just taking advantage of extract and the stack object.

The "okcounties" object could be a sp class subset of the counties object, that you could pass directly to extract eg., okcounties <- counties[counties$STATE_NAME=="Oklahoma",] .

If you drop the call to extent, which is returning a bounding box for each county and not the county boundary, things get much simpler. To leverage the stack you could just let extract provide a data.frame of the raster values. Here is a worked example on synthetic data. I approximated your object naming convention for this example. The final object "ok.county" I believe would be the same as the "county" object that you are trying to create.

First, let's create some example data and plot

library(raster)
library(sp) 

# create polygons
p <- raster(nrow=10, ncol=10)
  p[] <- runif(ncell(p)) * 10
  counties <- rasterToPolygons(p, fun=function(x){x > 9})
  counties$county <- paste0("county",1:nrow(counties))
  counties$STATE_NAME <- c(rep("CA",3),
                          rep("OK",nrow(counties)-3))

# Create raster stack  
r <- raster(nrow=100, ncol=100)
  r[] <- runif(ncell(r), 40,70)
  r <- stack(r, r+5, r+10) # stack 
  names(r) <- c("June", "July", "Aug") 

plot(r[[1]])
  plot(p, add=TRUE, lwd=4) 

We can use an index to subset to the state we are interested in.

ok <- counties[counties@data$STATE_NAME == "OK",]

Now we can use extract on the entire raster stack. The resulting object will be a list where each polygon has its own element in the list containing a data.frame. Each column of the data.frame represents a layer in the raster stack object.

ok.county <- extract(r, ok)
  class(ok.county)
  head(ok.county[[1]])  

However, if you want to collapse the list into a single data.frame, unique polygon identifiers are missing. Here we are going to use the ID column in the SpatialPolygonsDataFrame object. Since the list is ordered the same as the polygon object you can assign unique values from the polygon object. In your case it would likely be the county names and the method would follow the same as the example.

cnames <- unique( counties@data$county )   
  for(i in 1:length(ok.county)) {
    ok.county[[i]] <- data.frame(county = cnames[i], ok.county[[i]])   
  }
head(ok.county[[1]])

Now that we have a unique identifier assigned to each data.frame in the list we can collapse it using do.call.

ok.county <- as.data.frame(do.call("rbind", ok.county))
  str(ok.county)

Using an apply function we can pull the maximum value for a given column (time-period) for each unique ID.

tapply(ok.county[,"June"],  ok.county$county, max)

As to your original code, something like this would work (obviously, not tested) but there is no unique polygon ID tying results back to the county and it is still the bounding box of the county and not the polygon boundaries.

okcounties <- counties[counties$STATE_NAME=="Oklahoma",]
 county = NULL
 for (i in 1:nrow(okcounties)){
   county <- rbind(county, extract(OK.tmax[[1]], 
        extent(okcounties[i,]), cellnumbers=T))
 } 

Upvotes: 1

Related Questions