Olivia
Olivia

Reputation: 191

How to extract the density population of specific longitude and latitude points from the GHSL population data?

I have a data frame with about 20,000 rows containing the latitude and longitude of points spread across the world. I need to determine the population density at those points, so I considered using the GHSL population data. However, I am unsure how to extract that information from the GHSL population data, or if it is even possible. (

Upvotes: 0

Views: 263

Answers (2)

Robert Hijmans
Robert Hijmans

Reputation: 47481

You can use terra::extract

Example data:

library(geodata)
pop <- population(2020, 10, path=tempdir())
points <- cbind(lon=c(10, 15, 20, 25), lat = 5)

Solution:

terra::extract(pop, points)
#  population_density
#1         129.801056
#2           7.619854
#3          15.381500
#4           3.682034

See ?terra::extract for more examples

Upvotes: 2

L Tyrone
L Tyrone

Reputation: 7065

As you haven't provided any example data, information about which GHSL year to use, or your preferred language, here is an R reprex using example data.

I have used the 30arcsec resolution GHSL (~100m cell size) as the 3arcsec option was too large for my computer to handle. If you need a finer resolution, you can download the GHSL tile by tile, but that would be laborious. Given you point data are at global scale, I suspect ~100m will suffice. This example assumes your coordinates are WGS84/EPSG:4326.

First, load required packages and create example point df:

library(rnaturalearth)
library(terra)
library(sf)
library(dplyr)
library(ggplot2)

# Load world polygons and project to WGS84 Pseudo Mercator/EPSG:3857
world <- ne_countries() %>%
  filter(!admin == "Antarctica") %>%
  st_transform(3857)

# Create example points df using world and convert coordinates to WGS84:EPSG4326
set.seed(1)
df <- st_sample(world, size = 20000, type = "random") %>%
  st_as_sf() %>%
  st_transform(4326) %>%
  mutate(ID = 1:n(),
         lon = st_coordinates(.)[,1],
         lat = st_coordinates(.)[,2]) %>%
  data.frame() %>%
  select(-x)

head(df)
#   ID          lon       lat
# 1  1   26.2272108 -1.853224
# 2  2  146.9548044 65.990806
# 3  3 -105.8491530 73.182944
# 4  4 -116.4395691 70.634154
# 5  5   97.1429112 34.367544
# 6  6   -0.8282728 46.360984

Note the addition of the "ID" column. You will need to add this to your actual df as the output from terra::extract() creates an "ID" column, and adding the "ID" column provides a way to join your population density data back to your original data. Using dplyr::left_join() is not strictly necessary, you could just use the pop_den data, but I have assumed you have other columns in your df, so using a join makes things easier to keep track of. The workflow below joins the data to sf_points, but you could also just join pop_den back to your df using the same method.

Next, load GHSL as a SpatRaster:

# Load WGS84/EPSG:4326 GHSL, 30 arcsec, 202 from working directory,
# previously unzipped and downloaded from
# https://human-settlement.emergency.copernicus.eu/download.php
ghsl <- rast("data/GHS_BUILT_S_E2020_GLOBE_R2023A_4326_30ss_V1_0.tif")
ghsl <- ghsl * 1
names(ghsl) <- "pop_density_2020"

ghsl
# class       : SpatRaster 
# dimensions  : 21384, 43201, 1  (nrow, ncol, nlyr)
# resolution  : 0.008333333, 0.008333333  (x, y)
# extent      : -180.0012, 180.0071, -89.10042, 89.09958  (xmin, xmax, ymin, ymax)
# coord. ref. : lon/lat WGS 84 (EPSG:4326) 
# source      : spat_391c2813558f_14620.tif 
# varname     : GHS_BUILT_S_E2020_GLOBE_R2023A_4326_30ss_V1_0 
# name        : pop_density_2020 
# min value   :                0 
# max value   :           848108 

Now create an sf object from your df, extract cell values from ghsl, and join the result to sf_points:

# Create sf from df
sf_points <- st_as_sf(df, coords = c("lon", "lat"), crs = 4326)

# Return population density per point
pop_den <- extract(ghsl, sf_points)

# Join population density to sf_points
sf_points <- left_join(sf_points, pop_den, by = "ID")

# Examine result
filter(sf_points, pop_density_2020 > 0)
# Simple feature collection with 2542 features and 2 fields
# Geometry type: POINT
# Dimension:     XY
# Bounding box:  xmin: -155.8613 ymin: -55.07594 xmax: 177.6207 ymax: 72.03816
# Geodetic CRS:  WGS 84
# First 10 features:
#     ID pop_density_2020                    geometry
# 1   22             1365  POINT (-91.87298 35.95215)
# 2   45               21   POINT (21.92427 41.46684)
# 3   53               19  POINT (-80.00786 37.48202)
# 4   62              165   POINT (76.22779 19.11223)
# 5   65               88   POINT (101.8265 24.57473)
# 6   68             1712  POINT (-114.7794 42.88764)
# 7   72              366  POINT (-87.25635 33.31995)
# 8   73            76958   POINT (127.4405 36.84436)
# 9   81              720 POINT (-79.93223 -1.357552)
# 10 107               30  POINT (15.16695 -13.33818)

Upvotes: 2

Related Questions