mmTmmR
mmTmmR

Reputation: 573

Geopandas Overlay (Intersection) Returns Zero Rows

I created a single row GeoDataFrame with a Points geometry column as follows:

1. Create df

df = pd.DataFrame([[51.502687, -3.538329, 2242, 1, 47]],
                   columns = ["lat","long","num_of_trucks","performance","num_of_routes"]
                 )

df 

2. Create gdf from df

gdf = gpd.GeoDataFrame(df, geometry=gpd.points_from_xy(df["lat"],df["long"]),
                                                                 crs={"init": "epsg:4326"})
gdf

3. Reproject to CRS that uses metres

gdf.to_crs(epsg=3395,inplace=True)

4. Calculate 10KM Buffer Radius around Point and replace original geometry column for the Point with that for the new buffer radius Polygon

gdf["geometry"] = gdf.geometry.buffer(10000)

The above process produced my desired single row GeoDataFrame gdf with a new geometry column that contains a single Polygon.

I have a second GeoDataFrame called map_selection. This 2nd GeoDataFrame is made up of a few rows from this shapefile:https://geoportal.statistics.gov.uk/datasets/local-authority-districts-may-2020-boundaries-uk-buc. Each row of this GeoDataFrame represents a UK Local Authority District (as a Polygon/Multigon in its geometry column). Please note that this shapefile initially used CRS = EPSG 27700 but I changed this to CRS = ESPG 4326 when I loaded it as I needed to plot the Polygons in Geoviews.

Consider the following picture:

enter image description here

The Polygon contained in the newly produced gdf geometry column is the red circle. Each of the blue Polygons represent the Local Authority District Polygons/Multigons contained in the map_selection geometry column.

I am trying to work out the area of each blue Polygon in map_selection that add together to make up the single red Polygon in gdf. Note that the red Polygon does not intersect all Blue Polygons but definitely intersects some. My ultimate aim is to estimate the population of the red Polygon based on the populations of the appropriate blue Polygons.

I thought that the following would get me the intersections between the Polygons, however instead it returns zero rows:

5. Reproject gdf CRS to match that of map_selection:

gdf.to_crs(epsg=4326,inplace=True)

6. Return intersection between gdf and map_selection GeoDataFrames:

gpd.overlay(map_selection,gdf,how="intersection")

Output was zero rows, as was the output for:

gpd.overlay(gdf,map_selection,how="intersection")

Any thoughts or suggestions as to how I could get this working would be greatly appreciated.

Thanks

Upvotes: 1

Views: 1417

Answers (2)

mmTmmR
mmTmmR

Reputation: 573

So I plotted my 2 GeoDataFrames as suggested by @martinfleis and this showed me that my single Polygon's latitude and longitude values were inverted. Strangely they didn't appear this way when you looked at the row of the GeoDataFrame, only when plotted, which is why I didn't pick it up.

It turned out that Step 2 of my code should have been:

gdf = gpd.GeoDataFrame(df, geometry=gpd.points_from_xy(df["long"],df["lat"]),
                                                                 crs={"init": "epsg:4326"})

and not

gdf = gpd.GeoDataFrame(df, geometry=gpd.points_from_xy(df["lat"],df["long"]),
                                                                 crs={"init": "epsg:4326"})

To reiterate - if you look at your gdf GeoDataFrame after running either of the above, the latitude and longitude values will be the same for both cases. It is only when I plotted the overlay of these 2 GeoDataFrames in Geoviews did the inversion of latitude and longitude become apparent.

My overlay now successfully returns the expected intersection rows.

Upvotes: 1

martinfleis
martinfleis

Reputation: 7804

You have different projections most likely. You admin boundaries are in EPSG:27700.

To make sure that you are re-projecting to the CRS of other dataframe, always pass that CRS directly.

gdf.to_crs(map_selection.crs, inplace=True)

Upvotes: 0

Related Questions