Reputation: 458
I have this dataframe (pandas is imported as pd):
d = {
'geoid': ['13085970205'],
'FIPS': ['13085'],
'Year': [2024],
'parameters': [{"Year": 2024, "hpi_prediction": 304.32205}],
'geometry':[
{
"coordinates": [[[[-84.126456, 34.389734], [-84.12641, 34.39026], [-84.126323, 34.39068]]]],
"parameters": {"Year": 2024, "hpi_prediction": 304.32205},
"type": "MultiPolygon"
}
]
}
dd = pd.DataFrame(data=d)
To write this out, I use geopandas (import geopandas as gpd) to convert the data into a GeoDataFrame like this:
df_geopandas_hpi = gpd.GeoDataFrame(dd[['geoid', 'geometry']])
Once this happens, the parameters key in the original dataframe gets erased. Why? Note that the type of geometry in the example dataframe is geojson.geometry.MultiPolygon. How can I prevent this from happening?
What I essentially need to do is the following
import os

# Use `not` rather than `~` here: ~True is -2 and ~False is -1, both truthy.
if not os.path.exists('../verus_data'):
    os.mkdir('../verus_data')
for county, df_county in dd.groupby('FIPS'):
    if not os.path.exists('../verus_data/'+str(county)):
        os.mkdir('../verus_data/'+str(county))
    if not os.path.exists('../verus_data/'+str(county)+'/'+'predicted'):
        os.mkdir('../verus_data/'+str(county)+'/'+'predicted')
    if not os.path.exists('../verus_data/'+str(county)+'/'+'analyzed'):
        os.mkdir('../verus_data/'+str(county)+'/'+'analyzed')
    df_hpi = df_county[df_county['key'] == 'hpi']
    df_analyzed = df_county[df_county['key'] == 'analyzed']
    for year, df_year in df_hpi.groupby('Year'):
        if not os.path.exists('../verus_data/'+str(county)+'/'+'predicted'+'/'+str(year)):
            os.mkdir('../verus_data/'+str(county)+'/'+'predicted'+'/'+str(year))
        df_geopandas_hpi = gpd.GeoDataFrame(df_year[['geoid', 'geometry', 'parameters']])
        df_geopandas_hpi.to_file('../verus_data/'+str(county)+'/'+'predicted'+'/'+str(year)+'/'+'hpi_predictions.geojson', driver="GeoJSON")
    for year, df_year in df_analyzed.groupby('Year'):
        if not os.path.exists('../verus_data/'+str(county)+'/'+'analyzed'+'/'+str(year)):
            os.mkdir('../verus_data/'+str(county)+'/'+'analyzed'+'/'+str(year))
        df_geopandas_analyzed = gpd.GeoDataFrame(df_year[['geoid', 'geometry', 'parameters']])
        df_geopandas_analyzed.to_file('../verus_data/'+str(county)+'/'+'analyzed'+'/'+str(year)+'/'+'analyzed_values.geojson', driver="GeoJSON")
I need to somehow write out these GeoJSON files while keeping the parameters key intact.
Upvotes: 0
Views: 54
Reputation: 15452
Geopandas relies on the shapely library to handle geometry objects. Shapely has no concept of parameters or other additional metadata. Such members can be included at arbitrary levels in GeoJSON, but they don't fit the shapely or geopandas data models.
For example, when parsing with shapely.geometry.shape:
In [10]: shape = shapely.geometry.shape(
...: {
...: "coordinates": [[[[-84.126456, 34.389734], [-84.12641, 34.39026], [-84.126323, 34.39068]]]],
...: "parameters": {"Year": 2024, "hpi_prediction": 304.32205},
...: "type": "MultiPolygon"
...: }
...: )
In [11]: shape
Out[11]: <shapely.geometry.multipolygon.MultiPolygon at 0x11040eb60>
In [12]: shape.parameters
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
Input In [12], in <cell line: 1>()
----> 1 shape.parameters
AttributeError: 'MultiPolygon' object has no attribute 'parameters'
If you'd like to retain these, you'll need to parse the JSON separately from converting to geopandas. For example, if "parameters" is present in every element, you could simply assign it as a new column:
In [21]: gdf = gpd.GeoDataFrame(dd[["geoid", "geometry"]])
...: gdf["parameters"] = dd.geometry.str["parameters"]
In [22]: gdf
Out[22]:
         geoid                                           geometry                                   parameters
0  13085970205  {'coordinates': [[[[-84.126456, 34.389734], [-...  {'Year': 2024, 'hpi_prediction': 304.32205}
However, if the parameters field is not always present, you may need to do some extra cleaning. You can always access the elements of the geometry column within the pandas dataframe dd directly, e.g.
In [27]: dd.loc[0, "geometry"]["parameters"]["hpi_prediction"]
Out[27]: 304.32205
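For rows where the key might be missing, one option is a small helper that separates the standard geometry members from the extra parameters member and falls back to an empty dict. This is only a sketch: split_geometry is a made-up name, and it assumes each geometry is a plain dict (or a dict subclass such as geojson.geometry.MultiPolygon) shaped like the one in the question.

```python
def split_geometry(geom):
    """Return (geometry_only, parameters) for a GeoJSON-like geometry dict.

    Hypothetical helper; assumes the extra member is named "parameters".
    """
    # Copy everything except "parameters" so the input dict is not modified.
    geometry_only = {k: v for k, v in geom.items() if k != "parameters"}
    # Fall back to an empty dict when the member is missing or None.
    parameters = geom.get("parameters") or {}
    return geometry_only, parameters


with_params = {
    "type": "MultiPolygon",
    "coordinates": [[[[-84.126456, 34.389734], [-84.12641, 34.39026], [-84.126323, 34.39068]]]],
    "parameters": {"Year": 2024, "hpi_prediction": 304.32205},
}
without_params = {"type": "MultiPolygon", "coordinates": []}

geometry_only, parameters = split_geometry(with_params)
print(sorted(geometry_only))               # ['coordinates', 'type']
print(parameters)                          # {'Year': 2024, 'hpi_prediction': 304.32205}
print(split_geometry(without_params)[1])   # {}
```

With the dataframe from the question, something like dd["geometry"].apply(lambda g: split_geometry(g)[1]) should then give a parameters column that is always a dict, even for rows that lack the key.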
Upvotes: 1
Reputation: 458
All you have to do is include the parameters column when building the GeoDataFrame:
df_geopandas_hpi = gpd.GeoDataFrame(df_year[['geoid', 'geometry', 'parameters']])
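If the goal is only a GeoJSON file in which parameters survives, another route is to build the FeatureCollection by hand with the standard json module and move parameters into each feature's properties. A rough sketch, not run against the real data: to_feature_collection and the rows list are hypothetical names, and putting geoid into properties is my assumption about the desired layout.

```python
import json


def to_feature_collection(rows):
    """Build a GeoJSON FeatureCollection from dicts with 'geoid', 'geometry', 'parameters'.

    Hypothetical helper; assumes each geometry is a dict with 'type' and 'coordinates'.
    """
    features = []
    for row in rows:
        # Keep only the standard geometry members; extras go into properties.
        geometry = {
            "type": row["geometry"]["type"],
            "coordinates": row["geometry"]["coordinates"],
        }
        properties = {"geoid": row["geoid"]}
        # Tolerate rows where 'parameters' is missing or None.
        properties.update(row.get("parameters") or {})
        features.append({"type": "Feature", "geometry": geometry, "properties": properties})
    return {"type": "FeatureCollection", "features": features}


# One record shaped like the example row in the question.
rows = [
    {
        "geoid": "13085970205",
        "parameters": {"Year": 2024, "hpi_prediction": 304.32205},
        "geometry": {
            "coordinates": [[[[-84.126456, 34.389734], [-84.12641, 34.39026], [-84.126323, 34.39068]]]],
            "parameters": {"Year": 2024, "hpi_prediction": 304.32205},
            "type": "MultiPolygon",
        },
    }
]

collection = to_feature_collection(rows)
text = json.dumps(collection)  # use json.dump(collection, file_obj) to write a file
```

In the loop from the question, rows could presumably come from df_year[['geoid', 'geometry', 'parameters']].to_dict('records').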
Upvotes: 0