Kernel

Reputation: 709

Why does the GeoPandas dissolve function run forever?

I am trying to use the GeoPandas dissolve function to aggregate a few countries, but the call to countries.dissolve never finishes. Here is a minimal script:

import geopandas as gpd

shape='/Volumes/TwoGb/shape/fwdshapfileoftheworld/'
countries=gpd.read_file(shape+'TM_WORLD_BORDERS-0.3.shp')


# Add columns
countries['wmosubregion'] = ''
countries['dummy'] = ''

country_count = len(countries)

# Countries in this list are labelled EAST_AFRICA; all others keep the default.
country_list=['SO','SD','DJ','KM']
default = 'Null'
for i in range(country_count):
    countries.at[i, 'wmosubregion'] = default
    if countries.ISO2[i] in country_list:
        countries.at[i, 'wmosubregion'] = "EAST_AFRICA"
        print(countries.ISO2[i])

region_shapes = countries.dissolve(by='wmosubregion')

I am using the TM_WORLD_BORDERS-0.3 shapefile, which is freely accessible. You can get its files (TM_WORLD_BORDERS-0.3.shp, TM_WORLD_BORDERS-0.3.dbf, TM_WORLD_BORDERS-0.3.shx) from the following GitHub repository: https://github.com/rmichnovicz/Sick-Slopes/tree/master

Thanks

Upvotes: 1

Views: 77

Answers (1)

Bera

Reputation: 1949

Dissolve works when I try it and finishes in a few seconds. My GeoPandas version is 1.0.1.

import geopandas as gpd
df = gpd.read_file(r"C:\Users\bera\Downloads\TM_WORLD_BORDERS-0.3.shp")
df.plot(column="NAME")


df2 = df.dissolve()
df2.plot()


The file contains some invalid geometries, which might be causing the problem on your side. Try fixing them:

# df.geometry.is_valid.all()
# np.False_

# Four geometries are invalid
df.loc[~df.geometry.is_valid]
#     FIPS ISO2  ...     LAT                                           geometry
# 23    CA   CA  ...  59.081  MULTIPOLYGON (((-65.61362 43.42027, -65.61972 ...
# 32    CI   CL  ... -23.389  MULTIPOLYGON (((-67.21278 -55.89362, -67.24695...
# 154   NO   NO  ...  61.152  MULTIPOLYGON (((8.74361 58.40972, 8.73194 58.4...
# 174   RS   RU  ...  61.988  MULTIPOLYGON (((131.87329 42.95694, 131.82413 ...
# [4 rows x 12 columns]


df.geometry = df.geometry.make_valid()
# df.geometry.is_valid.all()
# np.True_
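
Combined with the script from the question, the repair step could be placed just before the dissolve. The following is only a minimal sketch: it uses a small made-up GeoDataFrame (three toy polygons, one deliberately self-intersecting) instead of the real shapefile, and it replaces the row-by-row loop with a vectorized `isin` assignment, which is my substitution rather than something from the question. I have not confirmed that the invalid geometries are what makes dissolve hang in your environment.

```python
import geopandas as gpd
from shapely.geometry import Polygon

# Toy stand-in for the world-borders layer (hypothetical data, not the real file).
# The "SO" polygon is a self-intersecting bowtie, so it is invalid on purpose.
countries = gpd.GeoDataFrame(
    {"ISO2": ["SO", "DJ", "FR"]},
    geometry=[
        Polygon([(0, 0), (2, 2), (2, 0), (0, 2)]),          # invalid bowtie
        Polygon([(2, 0), (4, 0), (4, 2), (2, 2)]),          # valid square
        Polygon([(10, 10), (12, 10), (12, 12), (10, 12)]),  # valid square
    ],
    crs="EPSG:4326",
)

country_list = ["SO", "SD", "DJ", "KM"]

# Vectorized labelling: default everywhere, EAST_AFRICA for listed ISO2 codes.
countries["wmosubregion"] = "Null"
countries.loc[countries["ISO2"].isin(country_list), "wmosubregion"] = "EAST_AFRICA"

# Repair invalid geometries before dissolving.
countries.geometry = countries.geometry.make_valid()

# One row per distinct wmosubregion value.
region_shapes = countries.dissolve(by="wmosubregion")
print(region_shapes.index.tolist())
```

With the real data, the `gpd.read_file(...)` call from the question would take the place of the toy GeoDataFrame; the rest should carry over unchanged.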

Upvotes: 2
