KayEss

Reputation: 419

GeoPandas .sjoin large result table

I have two dataframes, both containing geometry columns. The first dataframe contains POLYGONs and the second contains POINTs. My aim is to join the dataframes so that each POINT is assigned to its corresponding POLYGON. DF_polygons has 113704 rows and DF_points has 23223 rows.

I used this code to join the dataframes (I also tried 'within' and 'contains'):

points_in_polygons = gpd.sjoin(DF_points, DF_polygons, op='intersects')

The problem is that it returns a result with approximately 3,000,000 rows in all cases.

What could be the problem?

Upvotes: 0

Views: 617

Answers (1)

Subrat Prasad

Reputation: 183

A join is conceptually a cross product of the two DataFrames, filtered down to the row pairs that satisfy the specified criterion (for sjoin this can be within, contains, or intersects). Using gpd.sjoin(DF_points, DF_polygons, op='within') is correct for your case. The result can have any number of rows up to 113704 x 23223, so a result larger than either input is not an error in itself.

In short, there must be more than one POLYGON covering a single POINT, so for the same point you will observe multiple entries, one per matching polygon. With about 3,000,000 rows for 23223 points, that is roughly 130 matching polygons per point on average.
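To make this concrete, here is a minimal sketch with toy data; it is not your data. The two overlapping squares, the two points, and the column names poly_id and point_id are made up for illustration. It assumes a GeoPandas version that accepts the predicate keyword, which is the newer name for op; on older versions use op='within' instead.

```python
import geopandas as gpd
from shapely.geometry import Point, box

# Hypothetical toy data: two squares that overlap in the region [1, 2] x [1, 2].
polygons = gpd.GeoDataFrame(
    {"poly_id": ["A", "B"]},
    geometry=[box(0, 0, 2, 2), box(1, 1, 3, 3)],
)

# Point 1 lies in the overlap of A and B; point 2 lies only in A.
points = gpd.GeoDataFrame(
    {"point_id": [1, 2]},
    geometry=[Point(1.5, 1.5), Point(0.5, 0.5)],
)

# Same kind of join as in the question: one output row per (point, polygon) match.
joined = gpd.sjoin(points, polygons, predicate="within")

# Count how many polygons each point matched; values above 1 mean overlapping polygons.
matches_per_point = joined.groupby("point_id").size()
print(matches_per_point)

# One possible way to keep a single (arbitrary, first) match per point.
# The result keeps the index of the left frame, so duplicated index values
# mark points that matched more than one polygon.
one_per_point = joined[~joined.index.duplicated(keep="first")]
```

Running the same groupby on your own join result should show which points match many polygons. Whether keeping only the first match is acceptable depends on what the overlapping polygons represent, so treat that last step as one option rather than the fix.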

Upvotes: 1
