winecity

Reputation: 285

Convert Point Geometries to Linestrings with GeoPandas

I have a geodataframe gdf that looks like this:

        longitude   latitude    geometry
8628    4.890683    52.372383   POINT (4.89068 52.37238)
8629    4.890500    52.371433   POINT (4.89050 52.37143)
8630    4.889217    52.369469   POINT (4.88922 52.36947)
8631    4.889300    52.369415   POINT (4.88930 52.36942)
8632    4.889100    52.368683   POINT (4.88910 52.36868)
8633    4.889567    52.367416   POINT (4.88957 52.36742)
8634    4.889333    52.367134   POINT (4.88933 52.36713)

I am trying to convert these point geometries into a line. However, the code below raises an error: AttributeError: 'Point' object has no attribute 'values'

line_gdf = gdf['geometry'].apply(lambda x: LineString(x.values.tolist()))
line_gdf = gpd.GeoDataFrame(line_gdf, geometry='geometry')

Any ideas?

Upvotes: 5

Views: 5403

Answers (2)

swatchai

Reputation: 18762

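The error occurs because apply() passes each Point to the lambda one at a time, and a shapely Point has no .values attribute. A LineString has to be built from the whole sequence of Points at once, so all the Points in the geodataframe give you only 1 line. Here is the code you can run to create the LineString: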

from shapely.geometry import LineString

# only relevant code here
# use your gdf that has Point geometry
lineStringObj = LineString( [[a.x, a.y] for a in gdf.geometry.values] )

If you need a geodataframe of 1 row with this linestring as its geometry, proceed with this:

import pandas as pd
import geopandas as gpd

line_df = pd.DataFrame()
line_df['Attrib'] = [1,]
line_gdf = gpd.GeoDataFrame(line_df, geometry=[lineStringObj,])
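For readers who want to try the two snippets above end to end, here is a minimal self-contained sketch. It assumes the point geodataframe can be rebuilt from the question's sample coordinates with gpd.points_from_xy, and that the data is in EPSG:4326; neither is stated in the question, and the names df, line and 'Attrib' are only illustrative.

```python
import pandas as pd
import geopandas as gpd
from shapely.geometry import LineString

# Sample coordinates copied from the question
df = pd.DataFrame({
    "longitude": [4.890683, 4.890500, 4.889217, 4.889300,
                  4.889100, 4.889567, 4.889333],
    "latitude": [52.372383, 52.371433, 52.369469, 52.369415,
                 52.368683, 52.367416, 52.367134],
})

# Rebuild a point geodataframe like the one in the question
# (the CRS is an assumption; the question does not state one)
gdf = gpd.GeoDataFrame(
    df,
    geometry=gpd.points_from_xy(df.longitude, df.latitude),
    crs="EPSG:4326",
)

# One LineString from all points, in row order
line = LineString([[p.x, p.y] for p in gdf.geometry])

# One-row geodataframe holding that line
line_gdf = gpd.GeoDataFrame({"Attrib": [1]}, geometry=[line], crs=gdf.crs)
print(line_gdf)
```

The vertices follow the row order of gdf, so sort the rows first if the order of the points matters.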

Edit1

Pandas has a powerful aggregate function that can be used to collect all the coordinates (longitude, latitude) so that LineString() can build the required geometry from them.

The runnable code below demonstrates this approach.

import pandas as pd
import geopandas as gpd
from shapely.geometry import LineString
from shapely import wkt
from io import StringIO
import numpy as np

# Create a dataframe from CSV data
df5 = pd.read_csv(StringIO(
"""id longitude latitude
8628  4.890683  52.372383
8629  4.890500  52.371433
8630  4.889217  52.369469
8631  4.889300  52.369415
8632  4.889100  52.368683
8633  4.889567  52.367416
8634  4.889333  52.367134"""), sep=r"\s+")

# Using pandas' aggregate function
# Aggregate longitude and latitude
stack_lonlat = df5.agg({'longitude': np.stack, 'latitude':  np.stack})
# Create the LineString using aggregate values
lineStringObj = LineString(list(zip(*stack_lonlat)))

# (Previously used) Create a LineString directly from the dataframe values
#lineStringObj = LineString( list(zip(df5.longitude.tolist(), df5.latitude.tolist())) )
# The approach in @Phisan Santitamnont's answer may be the best.

# Create a geodataframe `line_gdf` for the lineStringObj
# This has single row, containing the linestring created from aggregation of (long,lat) data
df6 = pd.DataFrame()
df6['LineID'] = [101,]
line_gdf = gpd.GeoDataFrame(df6, crs='epsg:4326', geometry=[lineStringObj,])

# Plot the lineString in red
ax1 = line_gdf.plot(color="red", figsize=[4,10]);
# Plot the original data: "longitude", "latitude" as kind="scatter"
df5.plot("longitude", "latitude", kind="scatter", ax=ax1);

(Output: plot of the red LineString drawn over the scattered longitude/latitude points.)

Upvotes: 6

Phisan Santitamnont

Reputation: 23

As of 2022, I would like to propose another, more up-to-date and Pythonic style:

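from io import StringIO

import pandas as pd
import geopandas as gpd
import matplotlib.pyplot as plt
from shapely.geometry import LineString

# Create a dataframe from CSV data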
df = pd.read_csv(StringIO(
"""id longitude latitude
8628  4.890683  52.372383
8629  4.890500  52.371433
8630  4.889217  52.369469
8631  4.889300  52.369415
8632  4.889100  52.368683
8633  4.889567  52.367416
8634  4.889333  52.367134"""), sep=r"\s+")

ls = LineString( df[['longitude','latitude']].to_numpy() )
line_gdf = gpd.GeoDataFrame( [['101']],crs='epsg:4326', geometry=[ls] )

# Plot the lineString in red
ax = line_gdf.plot(color="red", figsize=[4,10]);
df.plot("longitude", "latitude", kind="scatter", ax=ax);
plt.show()
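As a quick sanity check, the constructions used in this thread (Point attributes, zipped columns, and a NumPy array) should all give the same line. The sketch below compares their coordinate sequences without plotting. The variable names are illustrative, and the Points are rebuilt from the sample coordinates rather than taken from the asker's actual gdf.

```python
import pandas as pd
from shapely.geometry import LineString, Point

# Sample coordinates copied from the question
df = pd.DataFrame({
    "longitude": [4.890683, 4.890500, 4.889217, 4.889300,
                  4.889100, 4.889567, 4.889333],
    "latitude": [52.372383, 52.371433, 52.369469, 52.369415,
                 52.368683, 52.367416, 52.367134],
})

# 1) From Point objects, reading x and y from each Point
points = [Point(x, y) for x, y in zip(df.longitude, df.latitude)]
from_points = LineString([[p.x, p.y] for p in points])

# 2) From the zipped longitude/latitude columns
from_zip = LineString(list(zip(df.longitude.tolist(), df.latitude.tolist())))

# 3) From an (N, 2) NumPy array
from_array = LineString(df[["longitude", "latitude"]].to_numpy())

# Compare the vertex sequences of the three lines
same = (list(from_points.coords) == list(from_zip.coords)
        == list(from_array.coords))
print(same)
```

Since the geometry is identical, choosing between these comes down to readability. The to_numpy() form skips the intermediate Python lists.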

Upvotes: 2
