Reputation: 1412
I have a pandas DataFrame like this:
name loc_x loc_y grp_name
a1 1.0 2.0 set1
a2 2.0 3.0 set1
a3 3.2 4.1 set2
a4 7.9 4.2 set2
I want to generate a GeoDataFrame that builds a polygon from loc_x and loc_y, grouped on grp_name, and that also includes a name column holding the name values from my original data frame concatenated by |. The result should look like this:
name geometry
set1 a1|a2 POLYGON ((1.0, 2.0)...)
set2 a3|a4 POLYGON ((3.2, 4.1)...)
I do this to get the geometry column, but how do I also get an additional name column concatenated from my base data frame?
gdf = gpd.GeoDataFrame(geometry=df.groupby('grp_name').apply(
lambda g: Polygon(gpd.points_from_xy(g['loc_x'], g['loc_y']))))
Upvotes: 0
Views: 1144
Reputation: 31236
groupby().apply() passes each group to your function as a DataFrame. It's then simple to construct the two outputs you want per group:
import pandas as pd
import geopandas as gpd
import shapely.geometry
import io
df = pd.read_csv(io.StringIO("""name loc_x loc_y grp_name
a1 1.0 2.0 set1
a2 2.0 3.0 set1
a2.5 3.0 4.0 set1
a3 3.2 4.1 set2
a4 7.9 4.2 set2
a4.5 8.1 4.3 set2"""), sep=r"\s+")
gpd.GeoDataFrame(
    df.groupby("grp_name").apply(
        lambda d: pd.Series(
            {
                "name": "|".join(d["name"].tolist()),
                "geometry": shapely.geometry.Polygon(
                    d.loc[:, ["loc_x", "loc_y"]].values
                ),
            }
        )
    )
)
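If you'd rather not build a pd.Series inside the lambda, here is a minimal sketch of an alternative that computes the two columns separately and then combines them. It assumes the same sample data as above, with three points per group, since a polygon can't be built from only two. The variable names (grouped, names, polygons, gdf) are mine, not part of any API.

```python
import io

import geopandas as gpd
import pandas as pd
from shapely.geometry import Polygon

# Same sample data as above: three points per group.
df = pd.read_csv(
    io.StringIO(
        """name loc_x loc_y grp_name
a1 1.0 2.0 set1
a2 2.0 3.0 set1
a2.5 3.0 4.0 set1
a3 3.2 4.1 set2
a4 7.9 4.2 set2
a4.5 8.1 4.3 set2"""
    ),
    sep=r"\s+",
)

grouped = df.groupby("grp_name")

# Join the names of each group with "|".
names = grouped["name"].agg("|".join)

# Build one polygon per group from its (loc_x, loc_y) pairs, in row order.
polygons = grouped[["loc_x", "loc_y"]].apply(lambda g: Polygon(g.to_numpy()))

# Both Series are indexed by grp_name, so they line up row for row.
gdf = gpd.GeoDataFrame(
    pd.DataFrame({"name": names, "geometry": polygons}),
    geometry="geometry",
)
print(gdf)
```

The result is indexed by grp_name, with one row per group. The polygon vertices follow the row order within each group, so sort the rows first if the points need to be visited in a particular order.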
Upvotes: 1