Reputation: 43
Hi, I am having trouble exporting a join between a shapefile and a CSV to a new shapefile using geopandas in Python 3.7. My code is below. When I run it, it saves a shapefile, but when I try to work with the data in that shapefile, the data type of several columns appears to have changed from int to something else. How can I fix the data type of 'Manufa_Emp' so that it remains an int when I export it?
import sys
import pandas as pd
import geopandas as gpd
import numpy
# Set Working Directory
sys.path.append(r"/Users/antonioramos/Desktop/Buzard_Research_Program")
# Read in gz.csv file as "ZCTA" Table
emp = r"/Users/antonioramos/Desktop/Buzard_Research_Program/DEC_00_SF3_DP3_with_ann.csv"
Table = pd.read_csv(emp, skiprows = 1)
# Create new table "ZCTA_Manufa" with only Block ID and Total employment columns
Tab2 = Table.loc[:,["Id2", "Number; Employed civilian population 16 years and over - INDUSTRY - Manufacturing"]].values
# renaming headers
Tab2 = pd.DataFrame(data=Tab2, columns=["ZCTA5CE00", "Manufa_Emp"])
# Import Shapefile
zips = r"/Users/antonioramos/Desktop/Buzard_Research_Program/tl_2010_06_zcta500.shp"
data = gpd.read_file(zips)
# To join the two together
Table3 = data.merge(Tab2, on='ZCTA5CE00')
zFeatures = Table3.filter(['Manufa_Emp', 'ZCTA5CE00', 'geometry'], axis = 1)
zFeatures['Manufa_Emp'].astype(int)
# Set geometry and CRS
geometry = zFeatures.geometry
geo_df = gpd.GeoDataFrame(Table3, geometry = geometry)
geo_df = geo_df.to_crs('epsg:5070')
sum(zFeatures['Manufa_Emp'])
# Export out as a shapefile
result = ("CA_ZCTA_Man6.shp")
geo_df.to_file(result)
Upvotes: 1
Views: 1151
Reputation: 23
This is the solution I have been using for the same problem. The code below writes the field as either a 'Short integer' or a 'Long integer' in an ESRI shapefile. Both can be read by ArcGIS Pro or ArcMap 10.x:
## For 'Short integer' format (DDN_shp_name is the output shapefile path)
schema = gpd.io.file.infer_schema(geo_df)
schema['properties']['Manufa_Emp'] = 'int32:4'
geo_df.to_file(DDN_shp_name, schema=schema)
## For 'Long integer' format
schema = gpd.io.file.infer_schema(geo_df)
schema['properties']['Manufa_Emp'] = 'int32:10'
geo_df.to_file(DDN_shp_name, schema=schema)
I still have not figured out how to prevent some of the other field changes that geopandas makes when to_file is used, but this should help with integers.
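One more thing that may be worth checking in the question's code: `zFeatures['Manufa_Emp'].astype(int)` returns a new Series, and the result is never assigned, so the column keeps whatever dtype it already had. Below is a minimal pandas-only sketch of that behaviour. The ZCTA codes and employment counts are made up for illustration, and I am assuming the column ends up as `object` after the `.values` round trip, which may not be the cause in your data.

```python
import pandas as pd

# Hypothetical stand-in for the CSV table (values are invented)
table = pd.DataFrame({
    "Id2": ["90001", "90002"],
    "Manufacturing": [120, 85],
})

# Going through .values with mixed column types gives a single object array,
# so the integer dtype is lost when the DataFrame is rebuilt
tab2 = pd.DataFrame(
    table[["Id2", "Manufacturing"]].values,
    columns=["ZCTA5CE00", "Manufa_Emp"],
)
lost_dtype = tab2["Manufa_Emp"].dtype

# astype returns a new Series; without assignment nothing changes
tab2["Manufa_Emp"].astype(int)
unchanged_dtype = tab2["Manufa_Emp"].dtype

# Assign the result back to actually change the column
tab2["Manufa_Emp"] = tab2["Manufa_Emp"].astype("int32")
fixed_dtype = tab2["Manufa_Emp"].dtype

print(lost_dtype, unchanged_dtype, fixed_dtype)
```

Note also that the question exports `geo_df`, which is built from `Table3` rather than `zFeatures`, so the cast would need to be applied to the frame that is actually written with `to_file`.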
Upvotes: 2