Reputation: 2457
I'm trying to read a CSV file with pandas:
pd.read_csv('zip_mapping.gz', compression='gzip', header=None, sep=',')
But the zip column is somehow read in as float, like this:
0 501.0
1 1220.0
2 1509.0
3 1807.0
4 2047.0
I don't know which column holds the zip codes before I read the data, so I can't set dtype in pd.read_csv.
I want to convert zip to int, but because of missing values I get a "could not convert NA to int" error.
I tried
str(zip).rstrip('0').rstrip('.')
but got this:
'0 501.0\n1 1220.0\n2 1509.0\n3 1807.0\n4 2047.0\nName: zip, dtype: float64'
What I actually want is to convert the float zip values to strings such as 501, 1220, 1509, 1807, 2047, so that I can then pad them with leading zeros.
Any suggestions? Thank you.
Upvotes: 2
Views: 12807
Reputation: 214937
You can use the Series.astype method to convert float to int and then to string. Here df refers to the data frame you read in from the CSV and df.zip refers to the zip column (adjust accordingly):
df.zip.astype(int).astype(str).str.zfill(5)
#0 00501
#1 01220
#2 01509
#3 01807
#4 02047
#Name: zip, dtype: object
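For reference, here is a minimal self-contained sketch of the same chain. The sample values are copied from the question, and the data frame is built by hand rather than read from the gzip file, so the column name "zip" is an assumption. Bracket access (df["zip"]) is used because it also works when the column name is not a valid attribute name.

```python
import pandas as pd

# Hand-built stand-in for the zip column from the question (no missing values).
df = pd.DataFrame({"zip": [501.0, 1220.0, 1509.0, 1807.0, 2047.0]})

# float -> int drops the trailing ".0"; int -> str makes zfill available.
padded = df["zip"].astype(int).astype(str).str.zfill(5)

print(padded.tolist())
# ['00501', '01220', '01509', '01807', '02047']
```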
If there are NA values in the column and you want to keep them as they are:
df['zip'] = df.zip.dropna().astype(int).astype(str).str.zfill(5)
df
# zip
#0 NaN
#1 01220
#2 01509
#3 01807
#4 02047
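The reason this keeps the missing rows is index alignment: the converted Series only has the labels that survived dropna(), and assigning it back leaves the other rows as missing. A small sketch under the assumption that row 0 is the missing one (the data frame is made up for illustration):

```python
import numpy as np
import pandas as pd

# Made-up frame with one missing zip in row 0.
df = pd.DataFrame({"zip": [np.nan, 1220.0, 1509.0, 1807.0, 2047.0]})

# Convert only the non-missing values; the result keeps index labels 1..4.
converted = df["zip"].dropna().astype(int).astype(str).str.zfill(5)

# Assignment aligns on the index, so row 0 has no match and stays missing.
df["zip"] = converted

print(df)
```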
Another option is to use a string formatter:
df.zip.apply(lambda x: x if pd.isnull(x) else "{:05.0f}".format(x))
#0 NaN
#1 01220
#2 01509
#3 01807
#4 02047
#Name: zip, dtype: object
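If the lambda gets reused, it can be pulled out into a small function. The sketch below is one way to do that; format_zip and its width parameter are hypothetical names introduced here, not part of pandas.

```python
import numpy as np
import pandas as pd

def format_zip(value, width=5):
    """Hypothetical helper: zero-pad a numeric zip code, pass missing values through."""
    if pd.isnull(value):
        return value
    # "0" fills with zeros, width sets the total length, ".0f" drops the decimals.
    return "{:0{w}.0f}".format(value, w=width)

# Made-up sample with one missing value.
zips = pd.Series([np.nan, 1220.0, 2047.0], name="zip")
formatted = zips.apply(format_zip)

print(formatted.tolist())
```

Note that the ".0f" format rounds, so this is only appropriate when the floats are whole numbers, as they are for zip codes read this way.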
Upvotes: 3