Reputation: 11
I have a GeoDataFrame whose columns include latitude, longitude, and the number of mobile subscriptions,
and another GeoDataFrame called "boundaries", which contains the boundary geometries.
I want to add a column to the boundaries GeoDataFrame that holds the sum of mobile subscriptions for the points (latitude and longitude) that fall inside each boundary.
I would really appreciate any help with this.
I have tried merging both GeoDataFrames, but I don't know how to aggregate the data by boundary.
Upvotes: 0
Views: 72
Reputation: 539
This answer computes the total number of subscriptions for the points that fall inside each boundary:
import geopandas as gpd
import pandas as pd
# creating a dummy boundaries GeoDataFrame
df = pd.DataFrame({'name': ['first boundary', 'second boundary'],
                   'area': ['POLYGON ((-10 -3, -10 3, 3 3, 3 -10, -10 -3))', 'POLYGON ((-20 -21, -12 -17, 2 -15, 5 -20, -20 -21))']})
boundaries = gpd.GeoDataFrame(df[['name']], geometry=gpd.GeoSeries.from_wkt(df.area, crs='epsg:4326'))
# creating a dummy GeoDataFrame with some points (replace these with your own coordinates)
points = pd.DataFrame({'num_sub': [1, 2, 3, 4, 5],
                       'coordenates': ['POINT(-7 1)', 'POINT(1 -2)', 'POINT(-17 -20)', 'POINT(0 -18)', 'POINT(-5 0)']})
subs_coordenates = gpd.GeoDataFrame(points[['num_sub']], geometry=gpd.GeoSeries.from_wkt(points.coordenates, crs='epsg:4326'))
# summing num_sub over the points that fall inside each boundary and storing it in a num_subs column
boundaries['num_subs'] = boundaries.geometry.apply(lambda poly: subs_coordenates.loc[subs_coordenates.geometry.within(poly), 'num_sub'].sum())
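As an alternative to applying a function per polygon, the same totals can probably be obtained with a spatial join followed by a group-by, which tends to scale better when there are many points. The sketch below is only an illustration under some assumptions: it rebuilds the dummy data from above, the variable names `subs`, `joined`, and `totals` are made up for this example, and it relies on the default `intersects` predicate of `gpd.sjoin`, so a point lying exactly on a boundary edge would be counted as well.

```python
import geopandas as gpd
from shapely.geometry import Point, Polygon

# Dummy boundaries, same shapes as in the example above
boundaries = gpd.GeoDataFrame(
    {'name': ['first boundary', 'second boundary']},
    geometry=[Polygon([(-10, -3), (-10, 3), (3, 3), (3, -10)]),
              Polygon([(-20, -21), (-12, -17), (2, -15), (5, -20)])],
    crs='EPSG:4326',
)

# Dummy subscription points (hypothetical name: subs)
subs = gpd.GeoDataFrame(
    {'num_sub': [1, 2, 3, 4, 5]},
    geometry=[Point(-7, 1), Point(1, -2), Point(-17, -20), Point(0, -18), Point(-5, 0)],
    crs='EPSG:4326',
)

# Attach to every point the attributes of the boundary it intersects
joined = gpd.sjoin(subs, boundaries, how='inner')

# Sum the subscriptions per boundary name
totals = joined.groupby('name')['num_sub'].sum()

# Map the totals back; boundaries without any point get 0
boundaries['num_subs'] = boundaries['name'].map(totals).fillna(0).astype(int)
print(boundaries[['name', 'num_subs']])
```

Grouping by `name` assumes the boundary names are unique; if they are not, grouping by a unique identifier column would be safer.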
If you have the X and Y coordinates in different columns (named X and Y in this example), you can do as follows:
points = pd.DataFrame({'num_sub': [1, 2, 3, 4, 5],
                       'X': [-7, 1, -17, 0, -5],
                       'Y': [1, -2, -20, -18, 0]})
# Converting the x and y columns to geometry points
points['coordenates'] = points[['X', 'Y']].apply(lambda x: 'POINT('+str(x.X)+' '+str(x.Y)+')', axis=1)
# creating the geopandas dataframe
subs_coordenates = gpd.GeoDataFrame(points[['num_sub']], geometry=gpd.GeoSeries.from_wkt(points.coordenates, crs = 'epsg:4326'))
# summing num_sub over the points that fall inside each boundary and storing it in a num_subs column
boundaries['num_subs'] = boundaries.geometry.apply(lambda poly: subs_coordenates.loc[subs_coordenates.geometry.within(poly), 'num_sub'].sum())
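If building WKT strings by hand feels clumsy, `gpd.points_from_xy` should give the same point geometries directly from the two columns. This is a minimal sketch that assumes the columns are named X and Y as above; with real data, X would be the longitude and Y the latitude.

```python
import geopandas as gpd
import pandas as pd

points = pd.DataFrame({'num_sub': [1, 2, 3, 4, 5],
                       'X': [-7, 1, -17, 0, -5],
                       'Y': [1, -2, -20, -18, 0]})

# Build the point geometries straight from the X (longitude) and Y (latitude) columns
subs_coordenates = gpd.GeoDataFrame(
    points[['num_sub']],
    geometry=gpd.points_from_xy(points['X'], points['Y']),
    crs='EPSG:4326',
)
print(subs_coordenates)
```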
Hope it works for you.
Upvotes: 0