Reputation: 285
I have a GeoDataFrame gdf that looks like this:
longitude latitude geometry
8628 4.890683 52.372383 POINT (4.89068 52.37238)
8629 4.890500 52.371433 POINT (4.89050 52.37143)
8630 4.889217 52.369469 POINT (4.88922 52.36947)
8631 4.889300 52.369415 POINT (4.88930 52.36942)
8632 4.889100 52.368683 POINT (4.88910 52.36868)
8633 4.889567 52.367416 POINT (4.88957 52.36742)
8634 4.889333 52.367134 POINT (4.88933 52.36713)
I am trying to convert these point geometries into a single line. However, the code below raises an error: AttributeError: 'Point' object has no attribute 'values'
line_gdf = gdf['geometry'].apply(lambda x: LineString(x.values.tolist()))
line_gdf = gpd.GeoDataFrame(line_gdf, geometry='geometry')
Any ideas?
Upvotes: 5
Views: 5403
Reputation: 18762
When you create a LineString from all the Points in a geodataframe, you get only 1 line. Here is the code you can run to create the LineString:
from shapely.geometry import LineString
# only relevant code here
# use your gdf that has Point geometry
lineStringObj = LineString( [[a.x, a.y] for a in gdf.geometry.values] )
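As for why the original attempt fails: `apply` on the geometry column calls the function once per element, so `x` is a single shapely Point, and a Point has no `.values` attribute. The following is a minimal sketch of that behaviour and of the fix; the three-row `gdf` is a stand-in built from the first rows of the question's data, not the asker's actual frame.

```python
import geopandas as gpd
from shapely.geometry import LineString, Point

# Stand-in for the question's gdf (first three rows only; an assumption for illustration)
coords = [(4.890683, 52.372383), (4.890500, 52.371433), (4.889217, 52.369469)]
gdf = gpd.GeoDataFrame(
    {"longitude": [c[0] for c in coords], "latitude": [c[1] for c in coords]},
    geometry=[Point(c) for c in coords],
)

# apply() hands the lambda one element at a time, and each element is a Point
first = gdf.geometry.iloc[0]
has_values = hasattr(first, "values")  # a Point exposes .x and .y, not .values

# Build one LineString from the coordinates of all Points, in row order
line = LineString([(p.x, p.y) for p in gdf.geometry])
```

The key point is that the LineString has to be built once from the whole column, not once per row.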
If you need a geodataframe of 1 row with this linestring as its geometry, proceed with this:
import pandas as pd
import geopandas as gpd
line_df = pd.DataFrame()
line_df['Attrib'] = [1,]
line_gdf = gpd.GeoDataFrame(line_df, geometry=[lineStringObj,])
Edit1
Pandas has a powerful aggregate function that can be used to collect all the coordinates (longitude, latitude) for use by LineString() to create the required geometry.
I offer this runnable code, which demonstrates that approach, for the benefit of readers.
import pandas as pd
import geopandas as gpd
from shapely.geometry import LineString
from shapely import wkt
from io import StringIO
import numpy as np
# Create a dataframe from CSV data
df5 = pd.read_csv(StringIO(
"""id longitude latitude
8628 4.890683 52.372383
8629 4.890500 52.371433
8630 4.889217 52.369469
8631 4.889300 52.369415
8632 4.889100 52.368683
8633 4.889567 52.367416
8634 4.889333 52.367134"""), sep=r"\s+")
# Using pandas' aggregate function
# Aggregate longitude and latitude
stack_lonlat = df5.agg({'longitude': np.stack, 'latitude': np.stack})
# Create the LineString using aggregate values
lineStringObj = LineString(list(zip(*stack_lonlat)))
# (Previously used) Create a LineString from dataframe values
#lineStringObj = LineString( list(zip(df5.longitude.tolist(), df5.latitude.tolist())) )
# Another approach by @Phisan Santitamnont may be the best.
# Create a geodataframe `line_gdf` for the lineStringObj
# This has single row, containing the linestring created from aggregation of (long,lat) data
df6 = pd.DataFrame()
df6['LineID'] = [101,]
line_gdf = gpd.GeoDataFrame(df6, crs='epsg:4326', geometry=[lineStringObj,])
# Plot the lineString in red
ax1 = line_gdf.plot(color="red", figsize=[4,10]);
# Plot the original data: "longitude", "latitude" as kind="scatter"
df5.plot("longitude", "latitude", kind="scatter", ax=ax1);
Upvotes: 6
Reputation: 23
As of 2022, I would like to propose another, more concise Pythonic approach:
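from io import StringIO
import pandas as pd
import geopandas as gpd
import matplotlib.pyplot as plt
from shapely.geometry import LineString
# Create a dataframe from CSV data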
df = pd.read_csv(StringIO(
"""id longitude latitude
8628 4.890683 52.372383
8629 4.890500 52.371433
8630 4.889217 52.369469
8631 4.889300 52.369415
8632 4.889100 52.368683
8633 4.889567 52.367416
8634 4.889333 52.367134"""), sep=r"\s+")
ls = LineString( df[['longitude','latitude']].to_numpy() )
line_gdf = gpd.GeoDataFrame( [['101']],crs='epsg:4326', geometry=[ls] )
# Plot the lineString in red
ax = line_gdf.plot(color="red", figsize=[4,10]);
df.plot("longitude", "latitude", kind="scatter", ax=ax);
plt.show()
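One caveat that applies to every approach in this thread: LineString connects the vertices in the order the rows appear, so the rows must already be in travel order. Below is a small sketch of sorting first, assuming (this is not stated in the question) that the `id` column reflects the intended order; the rows are deliberately shuffled for illustration.

```python
import pandas as pd
from shapely.geometry import LineString

# Three of the question's points, shuffled on purpose to mimic unsorted input
df = pd.DataFrame({
    "id": [8630, 8628, 8629],
    "longitude": [4.889217, 4.890683, 4.890500],
    "latitude": [52.369469, 52.372383, 52.371433],
})

# Assumption: `id` encodes the visiting order, so sort on it before building the line
ordered = df.sort_values("id")
ls = LineString(ordered[["longitude", "latitude"]].to_numpy())
first_vertex = ls.coords[0]
```

If a timestamp or sequence column is available, sorting on that instead is probably the safer choice.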
Upvotes: 2