Reputation: 3375
I am using the Python module arcgis to download a shapefile, located here: https://www.arcgis.com/home/item.html?id=2d5c785555aa4b0b946f1aa61c56274f
I have managed to extract it into a pandas dataframe by following the documentation:
from arcgis import GIS
import pandas as pd
gis = GIS(verify_cert=False,api_key=your_key)
# search for file by name which is National_LHO
search_result = gis.content.search(query="title:National_LHO", item_type="Feature Layer")
# get layer
layer = search_result[0].layers[0]
# dataframe from layer
df= pd.DataFrame.spatial.from_layer(layer)
# check it out
print(df.head())
FID LHO Shape__Area Shape__Length \
0 1 Carlow/Kilkenny 7.053876e+09 5.924032e+05
1 2 Cavan/Monaghan 8.858580e+09 7.801971e+05
2 3 Clare 9.055446e+09 8.005301e+05
3 4 Donegal 1.467971e+10 2.135710e+06
4 5 Dublin North 1.076876e+09 3.327819e+05
SHAPE
0 {"rings": [[[-747212.35980769, 6967909.5066712...
1 {"rings": [[[-781459.713316924, 7249668.124932...
2 {"rings": [[[-1083308.07544972, 6918940.329570...
3 {"rings": [[[-912809.697847961, 7265617.367554...
4 {"rings": [[[-674539.041086896, 7057323.867987...
I have then converted it to a geopandas dataframe:
import geopandas as gpd
gdf = gpd.GeoDataFrame(df)
However the CRS info looks strange. It says the bounds are (-180.0, -90.0, 180.0, 90.0)
but the geometry coordinates go much higher than that.
print(gdf.crs)
<Geographic 2D CRS: EPSG:4326>
Name: WGS 84
Axis Info [ellipsoidal]:
- Lat[north]: Geodetic latitude (degree)
- Lon[east]: Geodetic longitude (degree)
Area of Use:
- name: World
- bounds: (-180.0, -90.0, 180.0, 90.0)
Datum: World Geodetic System 1984
- Ellipsoid: WGS 84
- Prime Meridian: Greenwich
print(gdf.head())
FID LHO Shape__Area Shape__Length \
0 1 Carlow/Kilkenny 7.053876e+09 5.924032e+05
1 2 Cavan/Monaghan 8.858580e+09 7.801971e+05
2 3 Clare 9.055446e+09 8.005301e+05
3 4 Donegal 1.467971e+10 2.135710e+06
4 5 Dublin North 1.076876e+09 3.327819e+05
geometry
0 MULTIPOLYGON (((-747212.360 6967909.507, -7459...
1 MULTIPOLYGON (((-781459.713 7249668.125, -7813...
2 MULTIPOLYGON (((-1083308.075 6918940.330, -108...
3 MULTIPOLYGON (((-912809.698 7265617.368, -9127...
4 MULTIPOLYGON (((-674539.041 7057323.868, -6748...
I have been told that simply using to_crs('EPSG:4326') will convert everything to latitude and longitude, but that doesn't work.
gdf.to_crs('EPSG:4326').head()
As you can see they are still in xy coordinates, and not long / lat.
I cannot figure this out. I just want longitude and latitude polygons.
Does anybody know how to do this? I am losing my mind and have spent days on this. I am not a geopandas expert.
Perhaps some useful info. The data I downloaded has this spatial reference: Spatial Reference: 102100 (3857)
I tried to pass 3857 while converting to a GeoDataFrame, but got a warning:
gdf = gpd.GeoDataFrame(df, crs='3857')
FutureWarning: CRS mismatch between CRS of the passed geometries and 'crs'. Use 'GeoDataFrame.set_crs(crs, allow_override=True)' to overwrite CRS or 'GeoDataFrame.to_crs(crs)' to reproject geometries. CRS mismatch will raise an error in the future versions of GeoPandas.
Upvotes: 0
Views: 1966
Reputation: 256
It wasn't converting because the CRS information was lost when the plain pandas DataFrame was read into a GeoDataFrame:
from arcgis import GIS
import pandas as pd
gis = GIS(verify_cert=False)
# search for file by name which is National_LHO
search_result = gis.content.search(query="title:National_LHO", item_type="Feature Layer")
# get layer
layer = search_result[0].layers[0]
# dataframe from layer
df= pd.DataFrame.spatial.from_layer(layer)
# check it out
# print(df.head())
import geopandas as gpd
gdf = gpd.GeoDataFrame(df)
If you check gdf.crs, it returns nothing.
So the CRS has to be set manually. Set it to EPSG:3857, the spatial reference you mentioned:
gdf = gdf.set_geometry('SHAPE')
gdf = gdf.set_crs(epsg='3857')
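If the frame already carries a CRS (the question's printout shows EPSG:4326), set_crs will refuse to replace it unless allow_override=True is passed, as the FutureWarning in the question also suggests. Below is a minimal sketch on toy data, not the real layer: the single point is the first vertex from the question's output, and the names gdf and gdf_ll are only illustrative.

```python
import geopandas as gpd
from shapely.geometry import Point

# Toy stand-in for the downloaded layer: one Web Mercator vertex from the
# question's output, deliberately mislabelled as EPSG:4326.
gdf = gpd.GeoDataFrame(
    {"LHO": ["Carlow/Kilkenny"]},
    geometry=[Point(-747212.36, 6967909.51)],
    crs="EPSG:4326",
)

# Relabel only: the coordinates are left untouched, just the CRS metadata
# is replaced. Without allow_override=True this raises a ValueError,
# because a different CRS is already set.
gdf = gdf.set_crs(epsg=3857, allow_override=True)

# Reproject: this is the step that actually changes the coordinates.
gdf_ll = gdf.to_crs(epsg=4326)
print(gdf_ll.crs)
print(gdf_ll.geometry.iloc[0])
```

The distinction matters here: set_crs only states which CRS the existing numbers are in, while to_crs transforms them. Calling to_crs('EPSG:4326') on a frame that is already labelled EPSG:4326 changes nothing, which would match what the question describes.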
Now you can reproject to longitude and latitude:
gdf.to_crs(epsg='4326').head()
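As a rough sanity check on the reprojected values, the inverse spherical Web Mercator formula can be applied by hand to one vertex. This is a sketch assuming the standard EPSG:3857 sphere radius of 6378137 m; the helper name web_mercator_to_lonlat is made up for illustration, and the input is the first vertex shown in the question's SHAPE column.

```python
import math

# Sphere radius used by EPSG:3857 (equal to the WGS 84 semi-major axis), in metres.
R = 6378137.0


def web_mercator_to_lonlat(x, y):
    """Convert Web Mercator metres to (longitude, latitude) in degrees."""
    lon = math.degrees(x / R)
    lat = math.degrees(2.0 * math.atan(math.exp(y / R)) - math.pi / 2.0)
    return lon, lat


# First vertex of the first polygon, as printed in the question.
lon, lat = web_mercator_to_lonlat(-747212.35980769, 6967909.5066712)
print(round(lon, 4), round(lat, 4))
```

The result should land in Ireland and agree closely with the first vertex of gdf.to_crs(epsg='4326'). It also shows why the original numbers exceed the (-180, -90, 180, 90) bounds: they are metres, not degrees.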
Upvotes: 2