Reputation: 13807
I have an sf dataframe
object with a series of points representing the shape of a bus route. I would like to turn this object into a routable graph so I can estimate the time it takes to traverse from point c
to t
.
Here is what I've tried using the dodgr
package but I am not sure what I'm doing wrong here:
library(dodgr)
graph <- weight_streetnet(mydata, wt_profile = "motorcar", type_col="highway" , id_col = "id")
Error in check_highway_osmid(x, wt_profile) : Please specify type_col to be used for weighting streetnet
The data looks like in the image below
mydata <- structure(list(shape_id = c(52421L, 52421L, 52421L, 52421L, 52421L,
52421L, 52421L, 52421L, 52421L, 52421L, 52421L, 52421L, 52421L,
52421L, 52421L, 52421L, 52421L, 52421L, 52421L, 52421L), length = structure(c(0.191422504106197,
0.191422504106197, 0.191422504106197, 0.191422504106197, 0.191422504106197,
0.191422504106197, 0.191422504106197, 0.191422504106197, 0.191422504106197,
0.191422504106197, 0.191422504106197, 0.191422504106197, 0.191422504106197,
0.191422504106197, 0.191422504106197, 0.191422504106197, 0.191422504106197,
0.191422504106197, 0.191422504106197, 0.191422504106197), units = structure(list(
numerator = "km", denominator = character(0)), class = "symbolic_units"), class = "units"),
geometry = structure(list(structure(c(-46.5623281998182,
-23.5213458001468), class = c("XY", "POINT", "sfg")), structure(c(-46.562221,
-23.52129), class = c("XY", "POINT", "sfg")), structure(c(-46.562121,
-23.521235), class = c("XY", "POINT", "sfg")), structure(c(-46.5620233332577,
-23.5211840000609), class = c("XY", "POINT", "sfg")), structure(c(-46.561925666591,
-23.5211330000609), class = c("XY", "POINT", "sfg")), structure(c(-46.561828,
-23.521082), class = c("XY", "POINT", "sfg")), structure(c(-46.5618098335317,
-23.5212126666783), class = c("XY", "POINT", "sfg")), structure(c(-46.5617916670273,
-23.5213433333544), class = c("XY", "POINT", "sfg")), structure(c(-46.5617735004869,
-23.5214740000284), class = c("XY", "POINT", "sfg")), structure(c(-46.5617553339104,
-23.5216046667004), class = c("XY", "POINT", "sfg")), structure(c(-46.5617371672978,
-23.5217353333702), class = c("XY", "POINT", "sfg")), structure(c(-46.5617190006492,
-23.5218660000379), class = c("XY", "POINT", "sfg")), structure(c(-46.5617008339645,
-23.5219966667036), class = c("XY", "POINT", "sfg")), structure(c(-46.5616826672438,
-23.5221273333671), class = c("XY", "POINT", "sfg")), structure(c(-46.5616645004869,
-23.5222580000284), class = c("XY", "POINT", "sfg")), structure(c(-46.5616463336941,
-23.5223886666877), class = c("XY", "POINT", "sfg")), structure(c(-46.5616281668651,
-23.5225193333449), class = c("XY", "POINT", "sfg")), structure(c(-46.56161,
-23.52265), class = c("XY", "POINT", "sfg")), structure(c(-46.5617355000207,
-23.5226427501509), class = c("XY", "POINT", "sfg")), structure(c(-46.5618610000276,
-23.5226355002012), class = c("XY", "POINT", "sfg"))), class = c("sfc_POINT",
"sfc"), precision = 0, bbox = structure(c(xmin = -46.5623281998182,
ymin = -23.52265, xmax = -46.56161, ymax = -23.521082), class = "bbox"), crs = structure(list(
epsg = 4326L, proj4string = "+proj=longlat +datum=WGS84 +no_defs"), class = "crs"), n_empty = 0L),
id = c("a", "b", "c", "d", "e", "f", "g", "h", "i", "j",
"k", "l", "m", "n", "o", "p", "q", "r", "s", "t"), speed_kmh = c(11,
11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11,
11, 11, 11, 11)), sf_column = "geometry", agr = structure(c(shape_id = NA_integer_,
length = NA_integer_, id = NA_integer_, speed_kmh = NA_integer_
), class = "factor", .Label = c("constant", "aggregate", "identity"
)), row.names = c("1.13", "1.14", "1.15", "1.16", "1.17", "1.18",
"1.19", "1.20", "1.21", "1.22", "1.23", "1.24", "1.25", "1.26",
"1.27", "1.28", "1.29", "1.30", "1.31", "1.32"), class = c("sf",
"data.table", "data.frame"))
Upvotes: 2
Views: 975
Reputation: 148
If you want to include this in a 'tidy' workflow, you could also consider using a mix between sf
and tidygraph
. The latter offers a tidy framework for networks/graphs, in the form of a tbl_graph
class, which subclasses igraph
(hence, you can use tbl_graph
objects inside all igraph
functions as being an igraph
object). However, you can analyze your nodes and edges as being tibbles, and use functions as filter()
, select()
, mutate()
, et cetera. Of course, these tibbles can also contain a geometry list column that we know from sf
, adding geographic information to the nodes and edges.
The approach is far from perfect yet, and improvements would be very welcome, but still, it shows another way of handling the problem.
# Load libraries.
library(tidyverse)
library(sf)
library(tidygraph)
library(igraph)
library(units)
Just as in the other answers, we need to create edges between the nodes. For now, I assume that the points are simply connected in alphabetical order. For the tidygraph
approach, however, we seem to need numeric ID's rather than characters.
# Add a numeric ID column to the nodes.
nodes <- mydata %>%
rename(id_chr = id) %>%
rowid_to_column("id") %>%
select(id, id_chr, everything())
# Define the source node of each edge, and the target node of each edge.
sources <- nodes %>% slice(-n())
targets <- nodes %>% slice(-1)
# Write a function to create lines between data frames of source and target points.
pt2l <- function(x, y) { st_linestring(rbind(st_coordinates(x), st_coordinates(y))) }
# Create the edges.
edges <- tibble(
from = sources %>% pull(id),
to = targets %>% pull(id),
length = sources %>% pull(length),
speed = sources %>% pull(speed_kmh),
geometry = map2(st_geometry(sources), st_geometry(targets), pt2l)
) %>% st_as_sf() %>% st_set_crs(st_crs(nodes))
# Add a time column to the edges.
edges <- edges %>%
mutate(speed = set_units(speed, "km/h")) %>%
mutate(time = length / speed)
# Clean up the nodes data.
nodes <- nodes %>%
select(-length, -speed_kmh)
# Create the tbl_graph object out of the nodes and edges.
# Providing the edges as sf object is problematic for tidygraph, unfortunately.
# Therefore, we have to provide them as a tibble.
graph <- tbl_graph(nodes = nodes, edges = as_tibble(edges), directed = FALSE)
This gives us the following tbl_graph
object:
# A tbl_graph: 20 nodes and 19 edges
#
# An undirected simple graph with 1 component
#
# Node Data: 20 x 4 (active)
id id_chr shape_id geometry
<int> <chr> <int> <POINT [°]>
1 1 a 52421 (-46.56233 -23.52135)
2 2 b 52421 (-46.56222 -23.52129)
3 3 c 52421 (-46.56212 -23.52124)
4 4 d 52421 (-46.56202 -23.52118)
5 5 e 52421 (-46.56193 -23.52113)
6 6 f 52421 (-46.56183 -23.52108)
# … with 14 more rows
#
# Edge Data: 19 x 6
from to length speed geometry time
<int> <int> [km] [km/h] <LINESTRING [°]> [h]
1 1 2 0.1914225 11 (-46.56233 -23.52135, -46.56222 -23.5… 0.017402…
2 2 3 0.1914225 11 (-46.56222 -23.52129, -46.56212 -23.5… 0.017402…
3 3 4 0.1914225 11 (-46.56212 -23.52124, -46.56202 -23.5… 0.017402…
# … with 16 more rows
Now we have everything in a graph structure, we can select the node we want to route from, and the node we want to route to, and find the shortest path between them with the travel time as weight variable, using the shortest_path
function from igraph
. We now just work with a one-to-one route ('c' to 't'), but it would be just the same for one-to-many, many-to-one or many-to-many.
# Select the node from which and to which the shortest path should be found.
from_node <- graph %>%
activate(nodes) %>%
filter(id_chr == "c") %>%
pull(id)
to_node <- graph %>%
activate(nodes) %>%
filter(id_chr == "t") %>%
pull(id)
# Find the shortest path between these nodes
path <- shortest_paths(
graph = graph,
from = from_node,
to = to_node,
output = 'both',
weights = graph %>% activate(edges) %>% pull(time)
)
The resulting path is a list with the nodes and edges that make up the path.
$vpath
$vpath[[1]]
+ 18/20 vertices, from e43a089:
[1] 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
$epath
$epath[[1]]
+ 17/19 edges from e43a089:
[1] 3-- 4 4-- 5 5-- 6 6-- 7 7-- 8 8-- 9 9--10 10--11 11--12 12--13
[11] 13--14 14--15 15--16 16--17 17--18 18--19 19--20
We can create a subgraph of the original graph that only contains the nodes and edges of the shortest path.
path_graph <- graph %>%
subgraph.edges(eids = path$epath %>% unlist()) %>%
as_tbl_graph()
# A tbl_graph: 18 nodes and 17 edges
#
# An undirected simple graph with 1 component
#
# Node Data: 18 x 4 (active)
id id_chr shape_id geometry
<int> <chr> <int> <POINT [°]>
1 3 c 52421 (-46.56212 -23.52124)
2 4 d 52421 (-46.56202 -23.52118)
3 5 e 52421 (-46.56193 -23.52113)
4 6 f 52421 (-46.56183 -23.52108)
5 7 g 52421 (-46.56181 -23.52121)
6 8 h 52421 (-46.56179 -23.52134)
# … with 12 more rows
#
# Edge Data: 17 x 6
from to length speed geometry time
<int> <int> [km] [km/h] <LINESTRING [°]> [h]
1 1 2 0.1914225 11 (-46.56212 -23.52124, -46.56202 -23.5… 0.017402…
2 2 3 0.1914225 11 (-46.56202 -23.52118, -46.56193 -23.5… 0.017402…
3 3 4 0.1914225 11 (-46.56193 -23.52113, -46.56183 -23.5… 0.017402…
# … with 14 more rows
Here, something happens which I don't like. Tidygraph/igraph seems to have an internal node ID structure, and you see that in the subgraph, the from
and to
columns in the egdes data don't match with our id
column in the nodes data anymore, but instead simply refer to the row numbers of the nodes data. I am not sure how to fix that.
Either way, we now have our path from 'c' to 't' as a subgraph, and can easily analyze it. For example, by calculating the total travel time of the path (as was the question).
path_graph %>%
activate(edges) %>%
as_tibble() %>%
summarise(total_time = sum(time))
# A tibble: 1 x 1
total_time
[h]
1 0.2958348
But it is also easy to plot it, with geographic information preserved (just by exporting nodes and edges as sf objects).
ggplot() +
geom_sf(data = graph %>% activate(edges) %>% as_tibble() %>% st_as_sf(), col = 'darkgrey') +
geom_sf(data = graph %>% activate(nodes) %>% as_tibble() %>% st_as_sf(), col = 'darkgrey', size = 0.5) +
geom_sf(data = path_graph %>% activate(edges) %>% as_tibble() %>% st_as_sf(), lwd = 1, col = 'firebrick') +
geom_sf(data = path_graph %>% activate(nodes) %>% filter(id %in% c(from_node, to_node)) %>% as_tibble() %>% st_as_sf(), size = 2)
An r-spatial blog post might be coming up on this tidygraph-sf approach ;)
Upvotes: 3
Reputation: 340
The weight_streetnet
function is really only designed to handle actual street networks, generally as produced by the osmdata::osmdata_sf/sp/sc()
functions. It can nevertheless be tweaked to handle cases like this. The main thing needed is to convert the points into something that knows about edges in between them, like an sf::LINESTRING
object:
x <- sf::st_combine (mydata) %>%
sf::st_cast ("LINESTRING") %>%
sf::st_sf ()
That gives a single-row object which can then be converted to dodgr
format, and the id
values matched back on to the edges
net <- weight_streetnet (x, type_col = "shape_id", id_col = "id", wt_profile = 1)
net$from_id <- mydata$id [as.integer (net$from_id)]
net$to_id <- mydata$id [as.integer (net$to_id)]
At that point, dodgr
will have calculated and inserted the distances directly from the geographic coordinates. Your distances can then also be inserted and used for routing by replacing the d_weighted
values:
net$d_weighted <- as.numeric (mydata$length [1])
dodgr_dists (net, from = "c", to = "t") # 236.0481
If you really want your distances to represent absolute distances used to calculate the final result, then simply replace the $d
values
net$d <- net$d_weighted
dodgr_dists (net, from = "c", to = "t") # 3.254183
Note that for "simple" problems like this, igraph
will generally be faster, because it calculates routes using a single set of weights. The only real advantage of dodgr
in this context is the ability to use "dual weights" - the $d_weighted
and $d
values - such that the route is calculated according to $d_weighted
, and the final distances according to $d
.
Upvotes: 1
Reputation: 1630
I think you can solve it by transforming your data into an igraph object and use the functionalities in the igraph library. You need to establish the Edges and Vertex as well as weight values. In igraph an Edge is a link representing a connection among two nodes (Source and Target). In this case, a link is a "street" and the points are the nodes.
library(igraph)
GraphResult <- data.frame(Source = c(NULL),
Target = c(NULL),
weight = c(NULL))
for (i in 1:(dim(mydata)[1] - 1)) {
TempGraphResult <- data.frame(Source = c(0),
Target = c(0),
weight = c(0))
TempGraphResult$Source[1] <- mydata$id[i]
TempGraphResult$Target[1] <- mydata$id[i + 1]
TempGraphResult$weight[1] <- mydata$length[i]
GraphResult <- rbind(GraphResult, TempGraphResult) }
MyIgraph <- graph_from_data_frame(GraphResult)
#In this case works perfectly. But if you have more weight variables and even
#additional variables for the nodes, igraph have functions for constructing the
#igraph object
distances(MyIgraph, "c", "t") #returns 3.254183. Seems correct (0.1914225*17)
SquareMatrix <- distances(MyIgraph)
#*distances() is a function from igraph that performs the routing calculations.
Is possible to achieve more complex networks and calculate routes. For example, you can set the direction of the roads.
Maybe dodger can handle the problem, but I am not sure.
Upvotes: 1