T. Shaffner

Reputation: 419

osmnx get_nearest_edges function results unclear on which key

The get_nearest_edges function in the Python mapping package osmnx seems to return results based only on the nodes u and v. In some cases, though, the map downloading functions return multiple edges between the same two nodes. How do I tell which of these edges the point was closest to?

For example, using data from the following graph:

G = ox.graph_from_point((38.75,-77.15), distance=(5*1609.34),distance_type='bbox',simplify=True, network_type='drive', retain_all=True,truncate_by_edge=True,clean_periphery=True)

If I look up G[63441180][63441059] in the resulting network, it shows two edges with keys 1 and 2 and different geometries (they appear to be two separate parts of a loop if you plot the geometry). Here's the edge data between those two nodes:

AtlasView({1: {'osmid': 8808231, 'name': 'Fort Hunt Park Loop', 'highway': 'unclassified', 'oneway': False, 'length': 1698.8920000000003, 'geometry': <shapely.geometry.linestring.LineString object at 0x7f1250b8e3c8>}, 2: {'osmid': 8808231, 'name': 'Fort Hunt Park Loop', 'highway': 'unclassified', 'oneway': False, 'length': 289.381, 'geometry': <shapely.geometry.linestring.LineString object at 0x7f1250b8e320>}})

If I search for the edges closest to a list of points, however, the results contain only u and v, so they don't tell me which edge between these two nodes (i.e. which key) is the correct one.
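To make the ambiguity concrete, here is a minimal sketch on a toy graph rather than the downloaded one. The node ids and lengths are copied from the output above; `G_toy` and the other variable names are just for illustration. It only uses plain networkx and shows that a (u, v) pair can map to more than one edge, so the key is needed to pick one.

```python
import networkx as nx

# Toy stand-in for the downloaded network: two parallel edges
# between the same pair of nodes, distinguished only by their key.
G_toy = nx.MultiDiGraph()
G_toy.add_edge(63441180, 63441059, key=1, length=1698.892)
G_toy.add_edge(63441180, 63441059, key=2, length=289.381)

# G[u][v] maps each key to that edge's attribute dict
keys = sorted(G_toy[63441180][63441059])
lengths = {k: d["length"] for k, d in G_toy[63441180][63441059].items()}

# number of parallel edges from u to v
n_parallel = G_toy.number_of_edges(63441180, 63441059)

print(keys, lengths, n_parallel)
```

A result of the form (u, v) therefore cannot say which of these two edges was meant; a result of the form (u, v, key) can.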

Am I missing something? Or is this a bug?

Upvotes: 0

Views: 3354

Answers (1)

T. Shaffner

Reputation: 419

UPDATE: As noted by gboeing in the comment above, the most recent release of the library has fixed this bug.

After further research, I believe this is indeed a bug. I've opened an issue for it with the library at https://github.com/gboeing/osmnx/issues/435. Anyone who wants to follow its progress can do so there.

As noted there, I've also written a temporary fix for my own purposes that returns the edge key as well, unless/until the library is changed (or my understanding of the situation is corrected). The code is below, including the additional imports the functions need. Both functions take a new return_key argument; when it is True, the key is returned alongside u and v. The docstrings have not been fully updated, however.

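import time
import logging as lg

import numpy as np
import pandas as pd
from shapely.geometry import Point
from scipy.spatial import cKDTree
from sklearn.neighbors import BallTree

# graph_to_gdfs is called by both functions below, so it is imported here too
from osmnx import graph_to_gdfs, redistribute_vertices
from osmnx.utils import log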

def get_nearest_edge(G, point,return_key=False):
    """
    Return the nearest edge to a pair of coordinates. Pass in a graph and a tuple
    with the coordinates. We first get all the edges in the graph. Secondly we compute
    the euclidean distance from the coordinates to the segments determined by each edge.
    The last step is to sort the edge segments in ascending order based on the distance
    from the coordinates to the edge. In the end, the first element in the list of edges
    will be the closest edge that we will return as a tuple containing the shapely
    geometry and the u, v nodes (plus the edge key if return_key is True).
    Parameters
    ----------
    G : networkx multidigraph
    point : tuple
        The (lat, lng) or (y, x) point for which we will find the nearest edge
        in the graph
    return_key : bool
        If True, also return the key of the nearest edge, which distinguishes
        parallel edges between the same u and v
    Returns
    -------
    closest_edge_to_point : tuple (shapely.geometry, u, v) or (shapely.geometry, u, v, key)
        A geometry object representing the edge, followed by u and v, the OSM
        ids of its two end nodes, and the edge key if return_key is True.
    """
    start_time = time.time()

    gdf = graph_to_gdfs(G, nodes=False, fill_edge_geometry=True)
    if return_key:
        graph_edges = gdf[["geometry", "u", "v","key"]].values.tolist()
    else:
        graph_edges = gdf[["geometry", "u", "v"]].values.tolist()


    edges_with_distances = [
        (
            graph_edge,
            Point(tuple(reversed(point))).distance(graph_edge[0])
        )
        for graph_edge in graph_edges
    ]

    edges_with_distances = sorted(edges_with_distances, key=lambda x: x[1])
    closest_edge_to_point = edges_with_distances[0][0]

    if return_key:
        geometry, u, v,key = closest_edge_to_point
    else:
        geometry, u, v = closest_edge_to_point

    log('Found nearest edge ({}) to point {} in {:,.2f} seconds'.format((u, v), point, time.time() - start_time))

    if return_key:
        return geometry, u, v, key
    else:
        return geometry, u, v


def get_nearest_edges(G, X, Y, method=None, dist=0.0001,return_key=False):
    """
    Return the graph edges nearest to a list of points. Pass in points
    as separate vectors of X and Y coordinates. The 'kdtree' method
    is by far the fastest with large data sets, but only finds approximate
    nearest edges if working in unprojected coordinates like lat-lng (it
    precisely finds the nearest edge if working in projected coordinates).
    The 'balltree' method is second fastest with large data sets, but it
    is precise if working in unprojected coordinates like lat-lng. As a
    rule of thumb, if you have a small graph just use method=None. If you 
    have a large graph with lat-lng coordinates, use method='balltree'.
    If you have a large graph with projected coordinates, use 
    method='kdtree'. Note that if you are working in units of lat-lng,
    the X vector corresponds to longitude and the Y vector corresponds
    to latitude.
    Parameters
    ----------
    G : networkx multidigraph
    X : list-like
        The vector of longitudes or x's for which we will find the nearest
        edge in the graph. For projected graphs, use the projected coordinates,
        usually in meters.
    Y : list-like
        The vector of latitudes or y's for which we will find the nearest
        edge in the graph. For projected graphs, use the projected coordinates,
        usually in meters.
    method : str {None, 'kdtree', 'balltree'}
        Which method to use for finding nearest edge to each point.
        If None, we manually find each edge one at a time using
        osmnx.utils.get_nearest_edge. If 'kdtree' we use
        scipy.spatial.cKDTree for very fast euclidean search. Recommended for
        projected graphs. If 'balltree', we use sklearn.neighbors.BallTree for
        fast haversine search. Recommended for unprojected graphs.
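    dist : float
        spacing length along edges. Units are the same as the geom; Degrees for
        unprojected geometries and meters for projected geometries. The smaller
        the value, the more points are created.
    return_key : bool
        If True, also return the key of each nearest edge, which distinguishes
        parallel edges between the same u and v
    Returns
    -------
    ne : ndarray
        array of nearest edges represented by their startpoint and endpoint ids,
        u and v, the OSM ids of the nodes, plus the edge key if return_key is True.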
    Info
    ----
    The method creates equally distanced points along the edges of the network.
    Then, these points are used in a kdTree or BallTree search to identify which
    is nearest. Note that this method will not give the exact perpendicular point
    along the edge, but the smaller the *dist* parameter, the closer the solution
    will be.
    Code is adapted from an answer by JHuw from this original question:
    https://gis.stackexchange.com/questions/222315/geopandas-find-nearest-point
    -in-other-dataframe
    """
    start_time = time.time()

    if method is None:
        # calculate nearest edge one at a time for each (y, x) point
        ne = [get_nearest_edge(G, (y, x),return_key) for x, y in zip(X, Y)]
        if return_key:
            ne = [(u, v,k) for _, u, v,k in ne]
        else:
            ne = [(u, v) for _, u, v in ne]

    elif method == 'kdtree':

        # check if we were able to import scipy.spatial.cKDTree successfully
        if not cKDTree:
            raise ImportError('The scipy package must be installed to use this optional feature.')

        # transform graph into DataFrame
        edges = graph_to_gdfs(G, nodes=False, fill_edge_geometry=True)

        # transform edges into evenly spaced points
        edges['points'] = edges.apply(lambda x: redistribute_vertices(x.geometry, dist), axis=1)

        # develop edges data for each created points
        extended = edges['points'].apply([pd.Series]).stack().reset_index(level=1, drop=True).join(edges).reset_index()

        # Prepare btree arrays
        nbdata = np.array(list(zip(extended['Series'].apply(lambda x: x.x),
                                   extended['Series'].apply(lambda x: x.y))))

        # build a k-d tree for euclidean nearest node search
        btree = cKDTree(data=nbdata, compact_nodes=True, balanced_tree=True)

        # query the tree for nearest node to each point
        points = np.array([X, Y]).T
        dist, idx = btree.query(points, k=1)  # Returns ids of closest point
        eidx = extended.loc[idx, 'index']
        if return_key:
            ne = edges.loc[eidx, ['u', 'v','key']]
        else:
            ne = edges.loc[eidx, ['u', 'v']]

    elif method == 'balltree':

        # check if we were able to import sklearn.neighbors.BallTree successfully
        if not BallTree:
            raise ImportError('The scikit-learn package must be installed to use this optional feature.')

        # transform graph into DataFrame
        edges = graph_to_gdfs(G, nodes=False, fill_edge_geometry=True)

        # transform edges into evenly spaced points
        edges['points'] = edges.apply(lambda x: redistribute_vertices(x.geometry, dist), axis=1)

        # develop edges data for each created points
        extended = edges['points'].apply([pd.Series]).stack().reset_index(level=1, drop=True).join(edges).reset_index()

        # haversine requires data in form of [lat, lng] and inputs/outputs in units of radians
        nodes = pd.DataFrame({'x': extended['Series'].apply(lambda x: x.x),
                              'y': extended['Series'].apply(lambda x: x.y)})
        # np.float was removed in NumPy 1.24; the builtin float is equivalent here
        nodes_rad = np.deg2rad(nodes[['y', 'x']].values.astype(float))
        points = np.array([Y, X]).T
        points_rad = np.deg2rad(points)

        # build a ball tree for haversine nearest node search
        tree = BallTree(nodes_rad, metric='haversine')

        # query the tree for nearest node to each point
        idx = tree.query(points_rad, k=1, return_distance=False)
        eidx = extended.loc[idx[:, 0], 'index']
        if return_key:
            ne = edges.loc[eidx, ['u', 'v','key']]
        else:
            ne = edges.loc[eidx, ['u', 'v']]

    else:
        raise ValueError('You must pass a valid method name, or None.')

    log('Found nearest edges to {:,} points in {:,.2f} seconds'.format(len(X), time.time() - start_time))

    return np.array(ne)
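To illustrate why carrying the key through matters, here is a minimal, dependency-free sketch of the same idea as the method=None branch: measure the distance from the point to every edge geometry and return (u, v, key) for the closest one. It is not the library's implementation. The helper names (`point_segment_distance`, `point_line_distance`, `nearest_edge_with_key`) and the toy coordinates are hypothetical, and the geometry is treated as planar (x, y), so it assumes projected coordinates.

```python
import math


def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the segment a-b (all (x, y) tuples)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        # degenerate segment: both endpoints coincide
        return math.hypot(px - ax, py - ay)
    # position of the projection of p along the segment, clamped to [0, 1]
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def point_line_distance(p, coords):
    """Distance from p to a polyline given as a list of (x, y) vertices."""
    return min(point_segment_distance(p, a, b) for a, b in zip(coords, coords[1:]))


def nearest_edge_with_key(edges, point):
    """edges: iterable of (u, v, key, coords); point: (x, y). Returns (u, v, key)."""
    u, v, key, _ = min(edges, key=lambda e: point_line_distance(point, e[3]))
    return u, v, key


# Hypothetical toy data: two parallel edges between nodes 1 and 2
edges = [
    (1, 2, 0, [(0.0, 0.0), (1.0, 0.0)]),              # straight connection
    (1, 2, 1, [(0.0, 0.0), (0.5, 1.0), (1.0, 0.0)]),  # detour via (0.5, 1.0)
]

near_detour = nearest_edge_with_key(edges, (0.5, 0.9))
near_straight = nearest_edge_with_key(edges, (0.5, 0.1))
print(near_detour, near_straight)
```

Both query points resolve to the same u and v, and only the key tells the two results apart, which is the information the original (u, v) return value drops.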

Upvotes: 1
