Carlos Chavita

Reputation: 110

How to know if a coordinate is inside a polygon of coordinates

I need to check whether a coordinate lies inside a polygon and, if it does, return the value of the 'SCANOMBRE' column, which holds the name of each polygon. The polygons are stored in a GeoJSON file.

Polygon dataset

This dataset contains the following columns:

barrios = gpd.GeoDataFrame.from_file('dataset/scat_geojson.geojson')
print(barrios.SCANOMBRE, barrios.geometry)

The data printed above is specifically what I am interested in.

I also have a file of coordinate points, and I need to check whether each point falls inside one of the polygons listed above.

Point dataset

This dataset contains the Latitud and Longitud columns:

pdv = pd.read_excel('dataset/Matriz_Final.xlsx')
pdv.columns = pdv.columns.str.strip()
print(pdv.Latitud, pdv.Longitud)

My question is: how do I determine whether each of these coordinate points is inside a polygon, and how do I get back a dictionary containing the coordinates and the SCANOMBRE value for each point?

I would like the result of the validation in this form so that I can plot it later. I hope you can help. Thank you very much.

Upvotes: 0

Views: 1640

Answers (1)

Rob Raymond

Reputation: 31146

  • used countries as the polygons / multipolygons (barrios)
  • generated some points, some of which fall within countries (puntos)
  • a simple left sjoin() tells you which points are in a polygon and which are not
  • visualised the result to demonstrate that it works (green valid points, purple invalid)
import geopandas as gpd
import pandas as pd
import numpy as np

# Natural Earth countries stand in for the barrios polygons.
# Note: gpd.datasets was removed in GeoPandas 1.0; on newer versions,
# read a downloaded Natural Earth file instead.
barrios = gpd.read_file(gpd.datasets.get_path("naturalearth_lowres"))

# generate some points along a diagonal; some fall inside a country, some do not
# total_bounds is (minx, miny, maxx, maxy): x is longitude, y is latitude
minx, miny, maxx, maxy = barrios.total_bounds
puntos = pd.DataFrame(
    {
        "Latitud": np.linspace(miny, maxy, 100),
        "Longitud": np.linspace(minx, maxx, 100),
    }
)

# find valid points with a left spatial join
# points_from_xy takes x (longitude) first, then y (latitude)
valid = gpd.GeoDataFrame(
    puntos,
    geometry=gpd.points_from_xy(puntos["Longitud"], puntos["Latitud"]),
    crs="epsg:4326",
).sjoin(barrios.loc[:, ["geometry"]], how="left").assign(
    # index_right is NaN when no polygon contains the point
    valid=lambda d: (~d["index_right"].isna()).astype(int)
)

sample output (first 10 rows)

   Latitud   Longitud  geometry                                       index_right  valid
0  -90       -180      POINT (-180 -90)                               nan          0
1  -88.246   -176.364  POINT (-176.3636363636364 -88.24600878787879)  159          1
2  -86.492   -172.727  POINT (-172.7272727272727 -86.49201757575757)  159          1
3  -84.738   -169.091  POINT (-169.0909090909091 -84.73802636363637)  159          1
4  -82.984   -165.455  POINT (-165.4545454545454 -82.98403515151516)  nan          0
5  -81.23    -161.818  POINT (-161.8181818181818 -81.23004393939394)  nan          0
6  -79.4761  -158.182  POINT (-158.1818181818182 -79.47605272727273)  nan          0
7  -77.7221  -154.545  POINT (-154.5454545454545 -77.72206151515152)  159          1
8  -75.9681  -150.909  POINT (-150.9090909090909 -75.9680703030303)   nan          0
9  -74.2141  -147.273  POINT (-147.2727272727273 -74.2140790909091)   nan          0

visualize validation

m = barrios.explore()
valid.explore(m=m, column="valid")
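
getting the polygon name for each point

The join above only flags whether a point matched a polygon. To also get the polygon name that the question asks for, one option is to keep the name column on the right-hand side of the join. This is a minimal sketch, not a tested solution for the real files: the two square polygons, the names BARRIO A and BARRIO B, and the three points are made up to stand in for the GeoJSON and Excel data, and it assumes both datasets use EPSG:4326.

```python
import geopandas as gpd
import pandas as pd
from shapely.geometry import Polygon

# Hypothetical stand-in for the barrios GeoJSON: two adjacent squares,
# each with a name in the SCANOMBRE column.
barrios = gpd.GeoDataFrame(
    {"SCANOMBRE": ["BARRIO A", "BARRIO B"]},
    geometry=[
        Polygon([(0, 0), (1, 0), (1, 1), (0, 1)]),
        Polygon([(1, 0), (2, 0), (2, 1), (1, 1)]),
    ],
    crs="EPSG:4326",
)

# Hypothetical stand-in for the Excel sheet of points.
pdv = pd.DataFrame({"Latitud": [0.5, 0.5, 5.0], "Longitud": [0.5, 1.5, 5.0]})

# points_from_xy expects x (longitude) first, then y (latitude).
puntos = gpd.GeoDataFrame(
    pdv,
    geometry=gpd.points_from_xy(pdv["Longitud"], pdv["Latitud"]),
    crs="EPSG:4326",
)

# A left join keeps every point; SCANOMBRE is NaN where no polygon contains it.
joined = puntos.sjoin(
    barrios[["SCANOMBRE", "geometry"]], how="left", predicate="within"
)

# One dictionary per joined row, with the coordinates and the polygon name.
records = joined[["Latitud", "Longitud", "SCANOMBRE"]].to_dict("records")
print(records)
```

With real data, replace the two stand-in frames with the GeoJSON and Excel reads from the question. If the polygons overlap, a point inside more than one of them appears once per matching polygon in the joined result.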

Upvotes: 3