rafa.pereira
rafa.pereira

Reputation: 13807

Calculating a distance matrix based on spatial link distances (aka neighbor distances)

I would like to calculate a distance matrix between the centroids of the polygons in a region considering spatial link distance (aka 'neighbor distances') instead of simple Euclidean distances. Spatial link distance considers the Euclidean distances along the links of spatial neighbor links.

In other words, I would like to calculate spatial distance link matrix. Is there a package / function to do this?

I've done a quick exploration with sfdep and here's the solution I've found in the repex below:

reprex

Step 1: calculate distances between neighboring polygons (dists) using sfdep::st_nb_dists():

library(sf)
library(sfdep)
library(data.table)
library(fields)

# get contiguity and distance btwn neighbors
geo <- sf::st_geometry(guerry)
nb <- sfdep::st_contiguity(geo)
dists <- sfdep::st_nb_dists(geo, nb)

Step 2: I've created a nb_list_to_df() function to convert dists into a data.frame in long format with the distance for every pair or neighboring polygons (od_df)

# fun to convert nb dist to a data.frame in long format
nb_list_to_df <- function(nb, dists) {
  
  mtrx <- sfdep::wt_as_matrix(nb, dists)
  
  matrix_length <- 1:length(mtrx[1,])
  
  # FULL MATRIX
  mtrx_long <- cbind(
    as.data.table(
      data.table::CJ(matrix_length, matrix_length)), # df two columns
    'dist' = as.vector(mtrx)  # matrix values in a vector
  )
  
  # keep only dist between neighbors
  mtrx_long <- subset(mtrx_long, dist >0)
  setnames(mtrx_long, c('from', 'to', 'dist'))
  
  return(mtrx_long)
}

# convert nb dist to a data.frame in long format
od_df <- nb_list_to_df(nb, dists)
head(od_df)
#>    orig dest     dist
#> 1:    1   36 90030.63
#> 2:    1   37 87399.28
#> 3:    1   67 55587.69
#> 4:    1   69 85693.43
#> 5:    2    7 85221.38
#> 6:    2   49 75937.49

The final steps are to:

  1. Use od_df to create a network graph using a library like igraph. In this case, I'm using cppRouting because it seems to be the most efficient.
  2. Use the network topology to calculate the distance between all combinations of origin-destination pairs
# step 3: create a network graph
library(cppRouting)
graph  <-  makegraph(od_df, directed = F)

# step 4: get the distances considering network topology
dist_link <- get_distance_matrix(Graph=graph, 
                    from = unique(od_df$orig), 
                    to = unique(od_df$orig))

This code arrives at the desired solution. However, I'm wondering if this is a good approach or if there are better / more efficient ways to do this.

Upvotes: 0

Views: 172

Answers (1)

Duccio A
Duccio A

Reputation: 1562

As I understand you are concerned that expanding the nb object to a matrix would demand too much RAM, which means finding a different solution to the first part of your code. Here's a version without expanding the nb object into a matrix:

library(sf)
library(sfdep)
library(data.table)

# get contiguity and distance between neighbors
geo = sf::st_geometry(guerry)
nb = sfdep::st_contiguity(geo)
dists = sfdep::st_nb_dists(geo, nb)


# If you look at the structure of nb, it's a list the lenght of
# nrow(geo) and each element contains the index of the neighbours
# of each row.
str(nb)
#> List of 85
#>  $ : int [1:4] 36 37 67 69
#>  $ : int [1:6] 7 49 57 58 73 76
#>  $ : int [1:6] 17 21 40 56 61 69
#>  $ : int [1:4] 5 24 79 80
#>  ---
#>  $ : int [1:6] 15 34 35 47 75 83
#>  $ : int [1:6] 15 18 21 22 34 82
#>  $ : int [1:6] 50 52 53 65 66 68
#>  $ : int [1:5] 9 19 43 56 73
#>  - attr(*, "class")= chr [1:2] "nb" "list"
#>  - attr(*, "region.id")= chr [1:85] "1" "2" "3" "4" ...
#>  - attr(*, "call")= language spdep::poly2nb(pl = geometry, queen = queen)
#>  - attr(*, "type")= chr "queen"
#>  - attr(*, "sym")= logi TRUE
# The same goes for dists
str(dists)
#> List of 85
#>  $ : num [1:4] 90031 87399 55588 85693
#>  $ : num [1:6] 85221 75937 125737 86797 102273 ...
#>  $ : num [1:6] 85461 92563 96614 88010 66400 ...
#>  $ : num [1:4] 53294 98250 83762 83974
#> ---
#>  $ : num [1:6] 110004 84644 69201 108549 60336 ...
#>  $ : num [1:6] 89353 81323 69322 96794 102964 ...
#>  $ : num [1:6] 85539 90350 116728 95918 71842 ...
#>  $ : num [1:5] 65601 96398 100014 82309 95857
#>  - attr(*, "class")= chr [1:2] "nbdist" "list"
#>  - attr(*, "call")= language spdep::nbdists(nb = nb, coords = x, longlat = longlat)

# convert nb dist to a data.frame in long format
# You can loop through geo's row and collect destinations and distances as you go
n = length(nb)
res = data.table(from = integer(), to = integer(), dist = numeric())
for(i in seq_len(n)){
  res = rbind(res, data.table(from = i, to = nb[[i]], dist = dists[[i]]))
}
res
#>      from to      dist
#>   1:    1 36  90030.63
#>   2:    1 37  87399.28
#>   3:    1 67  55587.69
#>   4:    1 69  85693.43
#>   5:    2  7  85221.38
#>  ---                  
#> 416:   85  9  65601.32
#> 417:   85 19  96397.60
#> 418:   85 43 100013.94
#> 419:   85 56  82308.90
#> 420:   85 73  95857.49

Created on 2023-06-13 with reprex v2.0.2

This method appends a data.table to the initial null table at each iteration, it's probably not super fast but should be RAM efficient. It can be optimised if you run into speed problems.

Upvotes: 1

Related Questions