emax
emax

Reputation: 7235

Python: how to filter the points only in a specific boundary using geopandas?

I have a shapefile with some points.

import geopandas as gpd
from geopandas.tools import sjoin
import osmnx as ox

myShape = gpd.read_file('myShape.shp')
myShape.head(3)

geometry
0   POINT (-72.09513801999077 18.6526410972226)
1   POINT (-72.21044508038457 19.61039786418674)
2   POINT (-72.27903674995586 18.52939294725028)

Then I extracted a city from open street map and its boundaries. I want to keep only the points that are inside the city.

This is what I am doing:

city = ox.gdf_from_place('Paris', which_result=2)
city = gpd.GeoDataFrame(city)

myShapeTmp = myShape.copy()
for i in myShape.index:
    if (myShape['geometry'][i] in gdf['geometry']) == False:
        myShapeTmp = myShapeTmp.drop([i], axis=0)

But it takes forever. Is there a different way to do it?

Upvotes: 1

Views: 4344

Answers (2)

emax
emax

Reputation: 7235

I solved in this way

myShape = gpd.read_file('myShape.shp')
myShape.head(3)

geometry
0   POINT (-72.09513801999077 18.6526410972226)
1   POINT (-72.21044508038457 19.61039786418674)
2   POINT (-72.27903674995586 18.52939294725028)


city = ox.gdf_from_place('Paris', which_result = 2)
city = gpd.GeoDataFrame(city)
boundary = city.ix[0].geometry
myShapeTmp = myShape[myShape.geometry.within(boundary)]

Upvotes: 4

swatchai
swatchai

Reputation: 18782

A different way of doing to get points in polygon can be like this:

Let's create a file port_au_princ.csv with the following content:

index,geometry
0,POINT(-72.09513801999077 18.6526410972226)
1,POINT(-72.21044508038457 19.61039786418674)
2,POINT(-72.27903674995586 18.52939294725028)

Here is the code that does all the necessary operations.

import pandas as pd
import geopandas as gpd
from geopandas.tools import sjoin
import osmnx as ox
from shapely.wkt import loads

# get polygon of the city
city_pgon = ox.gdf_from_place('Port-au-Prince', which_result = 2)
# create a geoDataFrame out of it, set appropriate CRS
city_pgon = gpd.GeoDataFrame(city_pgon, crs={'init': 'epsg:4326'})
# read the file
pap = pd.read_csv('port_au_princ.csv', sep=',')
geometry = [loads(pt) for pt in pap['geometry']]  # prep geometry
# create geoDataFrame from `pap` dataframe
geo_pap = gpd.GeoDataFrame(pap, geometry=geometry, crs={'init': 'epsg:4326'})
pts_in_poly = gpd.sjoin(geo_pap, city_pgon, op='within', how='inner') # do spatial join

ax1 = city_pgon.plot()   # plot the polygon
geo_pap.plot(ax=ax1, color='red', zorder=5)         # plot red points
pts_in_poly.plot(ax=ax1, color='yellow', zorder=6)  # plot yellow point

The plot will be:

point-in-poly

Print out some information of pts_in_poly with this code:

print(pts_in_poly.index, pts_in_poly.geometry)

the output,

Int64Index([2], dtype='int64') 2    POINT (-72.27903674995586 18.52939294725028)
Name: geometry, dtype: object

Upvotes: 0

Related Questions