Antonio Ramos Rivera

Reputation: 43

Geopandas Export Changing Data Type

Hi, I am having trouble exporting a join between a shapefile and a CSV to a new shapefile using geopandas in Python 3.7. My code is below. When I run it, it saves a shapefile, but when I try to work with the data in that shapefile, the data type of several columns appears to have changed from int to something else. How can I fix the data type of 'Manufa_Emp' so that it remains an int when I export it?

import sys
import pandas as pd
import geopandas as gpd
import numpy

# Set Working Directory
sys.path.append(r"/Users/antonioramos/Desktop/Buzard_Research_Program")

# Read in gz.csv file as "ZCTA" Table
emp = r"/Users/antonioramos/Desktop/Buzard_Research_Program/DEC_00_SF3_DP3_with_ann.csv"
Table = pd.read_csv(emp, skiprows = 1)
                
# Create new table "ZCTA_Manufa" with only Block ID and Total employment columns
Tab2 = Table.loc[:, ["Id2", "Number; Employed civilian population 16 years and over - INDUSTRY - Manufacturing"]].values

# renaming headers
Tab2 = pd.DataFrame(data=Tab2, columns=["ZCTA5CE00", "Manufa_Emp"])

# Import Shapefile
zips = r"/Users/antonioramos/Desktop/Buzard_Research_Program/tl_2010_06_zcta500.shp"
data = gpd.read_file(zips)

# To join the two together
Table3 = data.merge(Tab2, on='ZCTA5CE00')

zFeatures = Table3.filter(['Manufa_Emp', 'ZCTA5CE00', 'geometry'], axis = 1)
zFeatures['Manufa_Emp'].astype(int)


# Set geometry and CRS
geometry = zFeatures.geometry
geo_df = gpd.GeoDataFrame(Table3, geometry = geometry)
geo_df = geo_df.to_crs('epsg:5070') 

sum(zFeatures['Manufa_Emp'])


# Export out as a shapefile
result = ("CA_ZCTA_Man6.shp")
geo_df.to_file(result)

Upvotes: 1

Views: 1151

Answers (1)

This is the solution I have been using for the same problem. The snippets below show how to write the field as either a 'Short integer' or a 'Long integer' in an ESRI shapefile. Both can be read by ArcGIS Pro or ArcMap 10.x:

## For 'Short integer' format
# DDN_shp_name is the path of the output shapefile, e.g. "CA_ZCTA_Man6.shp"
schema = gpd.io.file.infer_schema(geo_df)
schema['properties']['Manufa_Emp'] = 'int32:4'   # integer field, width 4
geo_df.to_file(DDN_shp_name, schema=schema)


## For 'Long integer' format
schema = gpd.io.file.infer_schema(geo_df)
schema['properties']['Manufa_Emp'] = 'int32:10'  # integer field, width 10
geo_df.to_file(DDN_shp_name, schema=schema)
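
If several fields need the same treatment, the override can be wrapped in a small helper. This is only a sketch: `with_integer_fields` is a hypothetical name, and the `inferred` dictionary is an illustrative stand-in for whatever `infer_schema(geo_df)` returns, assuming it has the usual 'geometry' and 'properties' keys. It has no geopandas dependency, so it can be tried on its own:

```python
import copy


def with_integer_fields(schema, widths):
    """Return a copy of a schema dict with the given fields forced to integers.

    `widths` maps field name -> field width (4 and 10 in the snippets above).
    The input schema is left untouched.
    """
    new_schema = copy.deepcopy(schema)
    for field, width in widths.items():
        if field not in new_schema["properties"]:
            raise KeyError(f"field {field!r} is not in the schema")
        new_schema["properties"][field] = f"int32:{width}"
    return new_schema


# Stand-in for an inferred schema; the types shown here are illustrative.
inferred = {
    "geometry": "Polygon",
    "properties": {"ZCTA5CE00": "str", "Manufa_Emp": "float"},
}

long_schema = with_integer_fields(inferred, {"Manufa_Emp": 10})
print(long_schema["properties"]["Manufa_Emp"])  # int32:10
```

The returned dictionary would then be passed as `schema=` to `to_file`, as in the snippets above.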

I still cannot figure out how to avoid some of the other field changes that geopandas makes when to_file is used, but this should help with integers.
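
It may also be worth checking the column's dtype before exporting. In the question's code, the result of `zFeatures['Manufa_Emp'].astype(int)` is never assigned back, and `astype` returns a new Series rather than modifying the column in place. A minimal sketch with made-up values (the real ones come from the census CSV) showing the difference:

```python
import pandas as pd

# Illustrative data only; not taken from the actual census table.
df = pd.DataFrame({"ZCTA5CE00": ["90001", "90002"], "Manufa_Emp": [120.0, 45.0]})

# astype returns a new Series; without assignment the column is unchanged.
df["Manufa_Emp"].astype(int)
dtype_before = str(df["Manufa_Emp"].dtype)

# Assign the result back to actually change the column's dtype.
df["Manufa_Emp"] = df["Manufa_Emp"].astype("int32")
dtype_after = str(df["Manufa_Emp"].dtype)

print(dtype_before, dtype_after)  # float64 int32
```

Whether this alone is enough depends on how the writer maps pandas dtypes to shapefile field types, so I would still pass the explicit schema as well.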

Upvotes: 2
