Reputation: 419
I have two dataframes, both containing geometry columns. The first dataframe contains POLYGON geometries, while the second contains POINT geometries. My aim is to join the dataframes so that each POINT is assigned to its corresponding POLYGON. DF_polygons has 113704 rows and DF_points has 23223 rows.
I used this code to join the dataframes (I also tried 'within' and 'contains'):
points_in_polygons = gpd.sjoin(DF_points, DF_polygons, op='intersects')
The problem is that it returns a result with approximately 3,000,000 rows in all cases.
What could be the problem?
Upvotes: 0
Views: 617
Reputation: 183
A join operation is a cross product between DataFrames, in which a row appears in the result only if it satisfies the specified criterion (for sjoin it can be within, contains, or intersects). Using gpd.sjoin(DF_points, DF_polygons, op='within') is correct for your case. The result is expected to be a DataFrame with a number of rows <= 113704 x 23223.
In short, there must be more than one POLYGON covering a single POINT, so for the same point you will observe multiple entries. With roughly 3,000,000 rows for 23223 points, that is about 129 matching polygons per point on average.
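To check whether overlapping polygons explain the row count, it can help to count matches per point. The following is a minimal sketch on toy data, not the asker's dataframes: the column names poly_id and pt_id and the geometries are made up for illustration. It assumes geopandas 0.10 or newer, where the op keyword used in the question is spelled predicate; on older versions, op='within' plays the same role.

```python
import geopandas as gpd
from shapely.geometry import Point, box

# Toy data (hypothetical): squares A and B overlap, C is far away.
polygons = gpd.GeoDataFrame(
    {"poly_id": ["A", "B", "C"]},
    geometry=[box(0, 0, 2, 2), box(1, 1, 3, 3), box(10, 10, 11, 11)],
)
points = gpd.GeoDataFrame(
    {"pt_id": [1, 2, 3]},
    geometry=[Point(0.5, 0.5), Point(1.5, 1.5), Point(5, 5)],
)

# Inner spatial join: one output row per (point, polygon) pair that matches.
joined = gpd.sjoin(points, polygons, how="inner", predicate="within")

# Count how many polygons each point fell into.
matches_per_point = joined.groupby("pt_id").size()

# Points matched by more than one polygon are the source of the extra rows.
multi = matches_per_point[matches_per_point > 1]

# If only one polygon per point is wanted, keep the first match
# (an arbitrary choice; pick a rule that suits the data).
one_per_point = joined.drop_duplicates(subset="pt_id")

print(len(joined), len(one_per_point))
print(multi)
```

Here point 2 lies in both A and B, so it appears twice in joined, while point 3 lies in no polygon and is dropped by the inner join. Running the same groupby on the real result would show how many polygons cover each point.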
Upvotes: 1