The get_nearest_edges function in the Python mapping package osmnx seems to return results based only on the nodes u and v. In some cases, though, the map-downloading functions return multiple edges between the same two nodes. How do I tell which of these edges the point was closest to?
For example, using data from the following graph:
G = ox.graph_from_point((38.75,-77.15), distance=(5*1609.34),distance_type='bbox',simplify=True, network_type='drive', retain_all=True,truncate_by_edge=True,clean_periphery=True)
If you look up G[63441180][63441059] in the resulting network, it shows two edges with keys 1 and 2 and different geometries (they appear to be two separate parts of a loop if you plot the geometries). Here are the edges between those two nodes:
AtlasView({1: {'osmid': 8808231, 'name': 'Fort Hunt Park Loop', 'highway': 'unclassified', 'oneway': False, 'length': 1698.8920000000003, 'geometry': <shapely.geometry.linestring.LineString object at 0x7f1250b8e3c8>}, 2: {'osmid': 8808231, 'name': 'Fort Hunt Park Loop', 'highway': 'unclassified', 'oneway': False, 'length': 289.381, 'geometry': <shapely.geometry.linestring.LineString object at 0x7f1250b8e320>}})
However, when I search for the edges closest to a list of points, the results contain only u and v, so they don't tell me which edge between these two nodes (i.e. which key) is the correct one.
Am I missing something? Or is this a bug?
UPDATE: As noted by gboeing in the comment above, the most recent release of the library has fixed this bug.
With further research, this does seem likely to be a bug. Accordingly, I've opened an issue for it with the library at https://github.com/gboeing/osmnx/issues/435. Anyone who wants to follow its progress can do so there.
As noted there, I've also written a temporary fix for my own purposes that returns this information correctly unless/until the library is changed (or my understanding of the situation is corrected). The code is below, along with the additional imports the functions need. The docstrings have not been updated to reflect the change, however.
```python
import logging as lg
import time

import numpy as np   # added: used by get_nearest_edges below
import pandas as pd  # added: used by get_nearest_edges below
from osmnx import graph_to_gdfs  # added: used by both functions
from osmnx import redistribute_vertices
from osmnx.utils import log
from scipy.spatial import cKDTree
from shapely.geometry import Point
from sklearn.neighbors import BallTree


def get_nearest_edge(G, point, return_key=False):
    """
    Return the nearest edge to a pair of coordinates. Pass in a graph and a tuple
    with the coordinates. We first get all the edges in the graph. Secondly we compute
    the euclidean distance from the coordinates to the segments determined by each edge.
    The last step is to sort the edge segments in ascending order based on the distance
    from the coordinates to the edge. In the end, the first element in the list of edges
    will be the closest edge that we will return as a tuple containing the shapely
    geometry and the u, v nodes.

    Parameters
    ----------
    G : networkx multidigraph
    point : tuple
        The (lat, lng) or (y, x) point for which we will find the nearest edge
        in the graph

    Returns
    -------
    closest_edge_to_point : tuple (shapely.geometry, u, v)
        A geometry object representing the segment and the coordinates of the two
        nodes that determine the edge section, u and v, the OSM ids of the nodes.
    """
    start_time = time.time()

    gdf = graph_to_gdfs(G, nodes=False, fill_edge_geometry=True)
    # keep the edge key as well when the caller asks for it
    if return_key:
        graph_edges = gdf[["geometry", "u", "v", "key"]].values.tolist()
    else:
        graph_edges = gdf[["geometry", "u", "v"]].values.tolist()

    # pair every edge with its distance to the point; shapely expects (x, y),
    # so the (lat, lng) input is reversed
    edges_with_distances = [
        (
            graph_edge,
            Point(tuple(reversed(point))).distance(graph_edge[0])
        )
        for graph_edge in graph_edges
    ]

    edges_with_distances = sorted(edges_with_distances, key=lambda x: x[1])
    closest_edge_to_point = edges_with_distances[0][0]

    if return_key:
        geometry, u, v, key = closest_edge_to_point
    else:
        geometry, u, v = closest_edge_to_point

    log('Found nearest edge ({}) to point {} in {:,.2f} seconds'.format((u, v), point, time.time() - start_time))

    if return_key:
        return geometry, u, v, key
    else:
        return geometry, u, v
```
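As a rough, dependency-free illustration of the brute-force idea behind get_nearest_edge with return_key=True, the sketch below measures the distance from a point to every edge geometry and keeps the full (u, v, key) identifier of the closest one. It is not the library's code: the function names, the dict-of-coordinates edge format and the toy data are all hypothetical, and points are given as (x, y) rather than the (lat, lng) order used above.

```python
import math


def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the segment a-b; all are (x, y) tuples."""
    px, py = p
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        # degenerate segment: both endpoints coincide
        return math.hypot(px - ax, py - ay)
    # project p onto the segment and clamp the projection to its endpoints
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def point_polyline_distance(p, coords):
    """Smallest distance from p to any segment of the polyline."""
    return min(point_segment_distance(p, a, b) for a, b in zip(coords, coords[1:]))


def nearest_keyed_edge(edges, point):
    """edges maps (u, v, key) -> list of (x, y) coordinates (hypothetical format)."""
    return min(edges, key=lambda edge_id: point_polyline_distance(point, edges[edge_id]))


# Made-up toy data: two parallel edges between nodes 1 and 2,
# one bending upwards and one bending downwards.
toy_edges = {
    (1, 2, 0): [(0.0, 0.0), (1.0, 1.0), (2.0, 0.0)],
    (1, 2, 1): [(0.0, 0.0), (1.0, -1.0), (2.0, 0.0)],
}

upper = nearest_keyed_edge(toy_edges, (1.0, 0.8))
lower = nearest_keyed_edge(toy_edges, (1.0, -0.8))
print(upper, lower)
```

Because both toy edges share the same u and v, only the third element of the returned tuple distinguishes them, which is exactly the information a (u, v)-only result drops.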
```python
def get_nearest_edges(G, X, Y, method=None, dist=0.0001, return_key=False):
    """
    Return the graph edges nearest to a list of points. Pass in points
    as separate vectors of X and Y coordinates. The 'kdtree' method
    is by far the fastest with large data sets, but only finds approximate
    nearest edges if working in unprojected coordinates like lat-lng (it
    precisely finds the nearest edge if working in projected coordinates).
    The 'balltree' method is second fastest with large data sets, but it
    is precise if working in unprojected coordinates like lat-lng. As a
    rule of thumb, if you have a small graph just use method=None. If you
    have a large graph with lat-lng coordinates, use method='balltree'.
    If you have a large graph with projected coordinates, use
    method='kdtree'. Note that if you are working in units of lat-lng,
    the X vector corresponds to longitude and the Y vector corresponds
    to latitude.

    Parameters
    ----------
    G : networkx multidigraph
    X : list-like
        The vector of longitudes or x's for which we will find the nearest
        edge in the graph. For projected graphs, use the projected coordinates,
        usually in meters.
    Y : list-like
        The vector of latitudes or y's for which we will find the nearest
        edge in the graph. For projected graphs, use the projected coordinates,
        usually in meters.
    method : str {None, 'kdtree', 'balltree'}
        Which method to use for finding nearest edge to each point.
        If None, we manually find each edge one at a time using
        osmnx.utils.get_nearest_edge. If 'kdtree' we use
        scipy.spatial.cKDTree for very fast euclidean search. Recommended for
        projected graphs. If 'balltree', we use sklearn.neighbors.BallTree for
        fast haversine search. Recommended for unprojected graphs.
    dist : float
        spacing length along edges. Units are the same as the geom; Degrees for
        unprojected geometries and meters for projected geometries. The smaller
        the value, the more points are created.

    Returns
    -------
    ne : ndarray
        array of nearest edges represented by their startpoint and endpoint ids,
        u and v, the OSM ids of the nodes.

    Info
    ----
    The method creates equally distanced points along the edges of the network.
    Then, these points are used in a kdTree or BallTree search to identify which
    is nearest. Note that this method will not give the exact perpendicular point
    along the edge, but the smaller the *dist* parameter, the closer the solution
    will be.

    Code is adapted from an answer by JHuw from this original question:
    https://gis.stackexchange.com/questions/222315/geopandas-find-nearest-point-in-other-dataframe
    """
    start_time = time.time()

    if method is None:
        # calculate nearest edge one at a time for each (y, x) point
        ne = [get_nearest_edge(G, (y, x), return_key) for x, y in zip(X, Y)]
        if return_key:
            ne = [(u, v, k) for _, u, v, k in ne]
        else:
            ne = [(u, v) for _, u, v in ne]

    elif method == 'kdtree':
        # check if we were able to import scipy.spatial.cKDTree successfully
        if not cKDTree:
            raise ImportError('The scipy package must be installed to use this optional feature.')

        # transform graph into DataFrame
        edges = graph_to_gdfs(G, nodes=False, fill_edge_geometry=True)

        # transform edges into evenly spaced points
        edges['points'] = edges.apply(lambda x: redistribute_vertices(x.geometry, dist), axis=1)

        # develop edges data for each created points
        extended = edges['points'].apply([pd.Series]).stack().reset_index(level=1, drop=True).join(edges).reset_index()

        # Prepare btree arrays
        nbdata = np.array(list(zip(extended['Series'].apply(lambda x: x.x),
                                   extended['Series'].apply(lambda x: x.y))))

        # build a k-d tree for euclidean nearest node search
        btree = cKDTree(data=nbdata, compact_nodes=True, balanced_tree=True)

        # query the tree for nearest node to each point
        points = np.array([X, Y]).T
        dist, idx = btree.query(points, k=1)  # Returns ids of closest point
        eidx = extended.loc[idx, 'index']
        if return_key:
            ne = edges.loc[eidx, ['u', 'v', 'key']]
        else:
            ne = edges.loc[eidx, ['u', 'v']]

    elif method == 'balltree':
        # check if we were able to import sklearn.neighbors.BallTree successfully
        if not BallTree:
            raise ImportError('The scikit-learn package must be installed to use this optional feature.')

        # transform graph into DataFrame
        edges = graph_to_gdfs(G, nodes=False, fill_edge_geometry=True)

        # transform edges into evenly spaced points
        edges['points'] = edges.apply(lambda x: redistribute_vertices(x.geometry, dist), axis=1)

        # develop edges data for each created points
        extended = edges['points'].apply([pd.Series]).stack().reset_index(level=1, drop=True).join(edges).reset_index()

        # haversine requires data in form of [lat, lng] and inputs/outputs in units of radians
        nodes = pd.DataFrame({'x': extended['Series'].apply(lambda x: x.x),
                              'y': extended['Series'].apply(lambda x: x.y)})
        # plain float replaces the np.float alias, which newer NumPy releases no longer provide
        nodes_rad = np.deg2rad(nodes[['y', 'x']].values.astype(float))
        points = np.array([Y, X]).T
        points_rad = np.deg2rad(points)

        # build a ball tree for haversine nearest node search
        tree = BallTree(nodes_rad, metric='haversine')

        # query the tree for nearest node to each point
        idx = tree.query(points_rad, k=1, return_distance=False)
        eidx = extended.loc[idx[:, 0], 'index']
        if return_key:
            ne = edges.loc[eidx, ['u', 'v', 'key']]
        else:
            ne = edges.loc[eidx, ['u', 'v']]

    else:
        raise ValueError('You must pass a valid method name, or None.')

    log('Found nearest edges to {:,} points in {:,.2f} seconds'.format(len(X), time.time() - start_time))

    return np.array(ne)
```
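The 'kdtree' and 'balltree' branches rely on the approach described in the docstring's Info section: place sample points along every edge, remember which edge each sample came from, and return the edge that owns the sample nearest to each query point. Here is a minimal pure-Python sketch of that idea, carrying the key along with u and v. It makes several simplifying assumptions: a linear scan stands in for the spatial index, sampling is done per segment rather than evenly along the whole line as redistribute_vertices is described as doing, and every name and the toy data are hypothetical.

```python
import math


def sample_along(coords, spacing):
    """Points at most `spacing` apart along a polyline (sampled per segment)."""
    samples = [coords[0]]
    for (ax, ay), (bx, by) in zip(coords, coords[1:]):
        seg_len = math.hypot(bx - ax, by - ay)
        steps = max(1, math.ceil(seg_len / spacing))
        for i in range(1, steps + 1):
            t = i / steps
            samples.append((ax + t * (bx - ax), ay + t * (by - ay)))
    return samples


def nearest_edges_by_sampling(edges, points, spacing):
    """edges maps (u, v, key) -> list of (x, y); returns one edge id per point."""
    # one (x, y, edge_id) record per sample, so each sample remembers its edge
    records = [
        (x, y, edge_id)
        for edge_id, coords in edges.items()
        for x, y in sample_along(coords, spacing)
    ]
    result = []
    for px, py in points:
        # linear scan; a k-d tree or ball tree would replace this on large graphs
        _, _, edge_id = min(records, key=lambda r: (r[0] - px) ** 2 + (r[1] - py) ** 2)
        result.append(edge_id)
    return result


# Made-up toy data: two parallel edges between nodes 1 and 2
toy_edges = {
    (1, 2, 0): [(0.0, 0.0), (1.0, 1.0), (2.0, 0.0)],
    (1, 2, 1): [(0.0, 0.0), (1.0, -1.0), (2.0, 0.0)],
}

matches = nearest_edges_by_sampling(toy_edges, [(1.0, 0.8), (1.0, -0.8)], spacing=0.1)
print(matches)
```

As with the dist parameter above, a smaller spacing gives a closer approximation at the cost of more sample points. A query point that is nearest to a node shared by both parallel edges is ambiguous, and this sketch simply returns whichever edge it encounters first.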