Gianni Spear
Gianni Spear

Reputation: 7936

Finding 1st order neighbors using shapefile polygons

I am looking a efficient way to find the 1st order neighbors of a given polygon. My data are in shapefile format.

My first idea was to calculate the x and y coordinates of the polygons' centroids in order to find the neighbor's centroids.

import pysal
from pysal.common import *
import pysal.weights
import numpy as np
from scipy import sparse,float32
import scipy.spatial
import os, gc, operator


def get_points_array_from_shapefile(inFile):
    """
    Gets a data array of x and y coordinates from a given shape file

    Parameters
    ----------
    shapefile: string name of a shape file including suffix

    Returns
    -------
    points: array (n,2) a data array of x and y coordinates

    Notes
    -----
    If the given shape file includes polygons,
    this function returns x and y coordinates of the polygons' centroids

    Examples
    --------
    Point shapefile
    >>> from pysal.weights.util import get_points_array_from_shapefile
    >>> xy = get_points_array_from_shapefile('../examples/juvenile.shp')
    >>> xy[:3]
    array([[ 94.,  93.],
           [ 80.,  95.],
           [ 79.,  90.]])

    Polygon shapefile
    >>> xy = get_points_array_from_shapefile('../examples/columbus.shp')
    >>> xy[:3]
    array([[  8.82721847,  14.36907602],
           [  8.33265837,  14.03162401],
           [  9.01226541,  13.81971908]])

    (source: https://code.google.com/p/pysal/source/browse/trunk/pysal/weights/util.py?r=1013)

    """
    f = pysal.open(inFile)
    shapes = f.read()
    if f.type.__name__ == 'Polygon':
        data = np.array([shape.centroid for shape in shapes])
    elif f.type.__name__ == 'Point':
        data = np.array([shape for shape in shapes])
    f.close()
    return data


inFile = "../examples/myshapefile.shp"
my_centr = get_points_array_from_shapefile(inFile)

This approach could be valid for a regular grid but in my case, I need to find a "more general" solution. The figure shows the problem. Consider the yellow polygon has the referee. The neighbor's polygons are the gray polygons. Using the centroids-neighbors approach, the clear blue polygon is considered a neighbor but it doesn't have a side in common with the yellow polygon.

A recent solution modified from Efficiently finding the 1st order neighbors of 200k polygons can be the following:

from collections import defaultdict
inFile = 'C:\\MultiShapefile.shp'

shp = osgeo.ogr.Open(inFile)
layer = shp.GetLayer()
BlockGroupVertexDictionary = dict()
for index in xrange(layer.GetFeatureCount()):
    feature = layer.GetFeature(index)
    FID = str(feature.GetFID())
    geometry = feature.GetGeometryRef()
    pts = geometry.GetGeometryRef(0)
    # delete last points because is the first (see shapefile polygon topology)
    for p in xrange(pts.GetPointCount()-1):
        PointText = str(pts.GetX(p))+str(pts.GetY(p))
        # If coordinate is already in dictionary, append this BG's ID
        if PointText in BlockGroupVertexDictionary:
            BlockGroupVertexDictionary[PointText].append(FID)
        # If coordinate is not already in dictionary, create new list with this BG's ID
        else:
            BlockGroupVertexDictionary[PointText] = [FID]

With this solution, I have a dictionary with vertex coordinates as the keys and a list of block group IDs that have a vertex at that coordinate as the value.

>>> BlockGroupVertexDictionary
{'558324.3057036361423.57178': ['18'],
 '558327.4401686361422.40755': ['18', '19'],
 '558347.5890836361887.12271': ['1'],
 '558362.8645026361662.38757': ['17', '18'],
 '558378.7836876361760.98381': ['14', '17'],
 '558389.9225016361829.97259': ['14'],
 '558390.1235856361830.41498': ['1', '14'],
 '558390.1870856361652.96599': ['17', '18', '19'],
 '558391.32786361398.67786': ['19', '20'],
 '558400.5058556361853.25597': ['1'],
 '558417.6037156361748.57558': ['14', '15', '17', '19'],
 '558425.0594576362017.45522': ['1', '3'],
 '558438.2518686361813.61726': ['14', '15'],
 '558453.8892486362065.9571': ['3', '5'],
 '558453.9626046361375.4135': ['20', '21'],
 '558464.7845966361733.49493': ['15', '16'],
 '558474.6171066362100.82867': ['4', '5'],
 '558476.3606496361467.63697': ['21'],
 '558476.3607186361467.63708': ['26'],
 '558483.1668826361727.61931': ['19', '20'],
 '558485.4911846361797.12981': ['15', '16'],
 '558520.6376956361649.94611': ['25', '26'],
 '558525.9186066361981.57914': ['1', '3'],
 '558527.5061096362189.80664': ['4'],
 '558529.0036896361347.5411': ['21'],
 '558529.0037236361347.54108': ['26'],
 '558529.8873646362083.17935': ['4', '5'],
 '558533.062376362006.9792': ['1', '3'],
 '558535.4436256361710.90985': ['9', '16', '20'],
 '558535.4437266361710.90991': ['25'],
 '558548.7071816361705.956': ['9', '10'],
 '558550.2603156361432.56769': ['26'],
 '558550.2603226361432.56763': ['21'],
 '558559.5872216361771.26884': ['9', '16'],
 '558560.3288756362178.39003': ['4', '5'],
 '558568.7811926361768.05997': ['1', '9', '10'],
 '558572.749956362041.11051': ['3', '5'],
 '558573.5437016362012.53546': ['1', '3'],
 '558575.3048386362048.77518': ['2', '3'],
 '558576.189546362172.87328': ['5'],
 '558577.1149386361695.34587': ['7', '10'],
 '558579.0999636362020.47297': ['1', '3'],
 '558581.6312396362025.36096': ['0', '1'],
 '558586.7728956362035.28967': ['0', '3'],
 '558589.8015336362043.7987': ['2', '3'],
 '558601.3250076361686.30355': ['7'],
 '558601.3250736361686.30353': ['25'],
 '558613.7793476362164.19871': ['2', '5'],
 '558616.4062876361634.7097': ['7'],
 '558616.4063116361634.70972': ['25'],
 '558618.129066361634.29952': ['7', '11', '22'],
 '558618.1290896361634.2995': ['25'],
 '558626.9644156361875.47515': ['10', '11'],
 '558631.2229836362160.17325': ['2'],
 '558632.0261236361600.77448': ['25', '26'],
 '558639.495586361898.60961': ['11', '13'],
 '558650.4935686361918.91358': ['12', '13'],
 '558659.2473416361624.50945': ['8', '11', '22', '24'],
 '558664.5218136361857.94836': ['7', '10'],
 '558666.4126376361622.80343': ['8', '24'],
 '558675.1439056361912.52276': ['12', '13'],
 '558686.3385396361985.08892': ['0', '1'],
..................
.................
 '558739.4377836361931.57279': ['11', '13'],
 '558746.8758486361973.84475': ['11', '13'],
 '558751.3440576361902.20399': ['6', '11'],
 '558768.8067026361258.4715': ['26'],
 '558779.9170276361961.16408': ['6', '11'],
 '558785.7399596361571.47416': ['22', '24'],
 '558791.5596546361882.09619': ['8', '11'],
 '558800.2351726361877.75843': ['6', '8'],
 '558802.7700816361332.39227': ['26'],
 '558802.770176361332.39218': ['22'],
 '558804.7899976361336.78827': ['22'],
 '558812.9707376361565.14513': ['23', '24'],
 '558833.2667696361940.68932': ['6', '24'],
 '558921.2068976361539.98868': ['22', '23'],
 '558978.3570116361885.00604': ['23', '24'],
 '559022.80716361982.3729': ['23'],
 '559096.8905816361239.42141': ['22'],
 '559130.7573166361935.80614': ['23'],
 '559160.3907086361434.15513': ['22']}

enter image description here

Upvotes: 5

Views: 3911

Answers (2)

Jzl5325
Jzl5325

Reputation: 3974

Just incase this is still an open question for the OP or someone else stumbles here.

import pysal as ps 
w = ps.queen_from_shapefile('shapefile.shp')

http://pysal.readthedocs.io/en/latest/users/tutorials/weights.html#pysal-spatial-weight-types

Upvotes: 8

martineau
martineau

Reputation: 123473

I am not familiar with the specific data formats being used, but regardless, think the following idea would work.

In Python you can make sets out of tuples of numbers, i.e. (x,y) and (x1,y1,x2,y2), so it should be possible to make a set representing all the points or edges in a given polygon. Afterwards you would be able use very fast set intersection operations to find all 1st order neighbors.

You might be able to speed the process up using some sort of trivial rejection test to avoid further processing of polygons which could not possibly be a neighbor -- perhaps using your polygons' centroids idea.

Does this explanation make sense?

Upvotes: 0

Related Questions