Wei Liao
Wei Liao

Reputation: 35

How to adjust population data in a raster dataset using census data from an administrative boundary shapefile

I downloaded 100m resolution population raster data from WorldPop, and through transprojection and cropping, I got a city's population spatial distribution raster data, in addition, I have the administrative boundary shapefile data of the county level of this city, and my purpose is to correct the population raster based on the census (county level) data of the year.

##Mask and crop one county from the population data
popc <- trim(mask(pop, county[1, ]))
##Calculate the correction factor and make correction
POPc <- popc*(county[1, ]$census_data/global(popc, sum, na.rm = T)[1,])

I plan to crop out the population rasters for each county separately, then correct them by calculating their correction factors, and finally merge the rasters. However, this way is time-consuming, I wonder if there is an easier way to do this?

Upvotes: 2

Views: 106

Answers (1)

L Tyrone
L Tyrone

Reputation: 6885

If your intention is to use the ratio between the county populations from WorldPop and county$census_data to adjust your WorldPop cell values, you can do this using raster arithmetic:

  1. Create a new version of the WorldPop raster where each cell represents the WorldPop county population it falls within e.g. pop1;
  2. Using the original WorldPop raster as a reference, use rasterize() to create a raster version of the county shapefile with corresponding county$census_data population values e.g. county_r;
  3. Use county_r and pop1 to define the ratio to adjust the cell values in pop

Note that this example uses the default CRS from the tigris data, and is at a lower resolution than your WorldPop data for the sake of speed and simplicity.

Step 0: Load packages and create example data

library(sf) # 1.0-16 
library(terra) # 1.7.71
library(tigris) # 2.1
library(dplyr) # 1.1.4

# Create example city by county sf data using tigris package
# Add ID to your actual data for left_join() in Step 1
county <- counties("Maine", cb = TRUE) %>%
  mutate(ID = 1:n())

# Get extent of county and use values to create a proxy WorldPop SpatRaster
st_bbox(county)

#      xmin      ymin      xmax      ymax 
# -71.08392  42.97776 -66.94989  47.45969

set.seed(1)
pop <- rast(resolution = 100/11113.9, nlyrs = 1,
            xmin = -71.08392, xmax = -66.94989, 
            ymin = 42.97776, ymax = 47.45969, 
            names = "population",
            crs = crs(county)) %>%
  init("cell") %>%
  crop(county, mask = TRUE)

pop[] <- sample(1:100, nrow(pop) * ncol(pop), replace = TRUE)

pop
# class       : SpatRaster
# dimensions  : 498, 459, 1  (nrow, ncol, nlyr)
# resolution  : 0.008997742, 0.008997742  (x, y)
# extent      : -71.08392, -66.95396, 42.97776, 47.45864  (xmin, xmax, ymin, ymax)
# coord. ref. : lon/lat NAD83 (EPSG:4269)
# source(s)   : memory
# name        : population
# min value   :          1
# max value   :        100

Step 1: Create population by county raster from WorldPop data

pop1 <- extract(pop, county, fun = sum, na.rm = TRUE) %>%
  left_join(county, ., by = "ID") %>%
  rasterize(., pop, field = "population")

Step 2: rasterize() county shapefile with county$census population values to corresponding cells

county_r <- extract(pop, county, fun = sum, na.rm = TRUE) %>%
  left_join(county, ., by = "ID") %>%
  mutate(census_data = ceiling(population * 1.1)) %>% # For illustrative purposes, delete this line for your actual data
  rasterize(., pop, field = "census_data")

Step 3: Adjust original WorldPop values using county_r and pop1

POPc <- pop * (county_r / pop1)

POPc
# class       : SpatRaster
# dimensions  : 498, 459, 1  (nrow, ncol, nlyr)
# resolution  : 0.008997742, 0.008997742  (x, y)
# extent      : -71.08392, -66.95396, 42.97776, 47.45864  (xmin, xmax, ymin, ymax)
# coord. ref. : lon/lat NAD83 (EPSG:4269)
# source(s)   : memory
# name        : population
# min value   :     1.1000
# max value   :   110.0018

Upvotes: 1

Related Questions