Reputation: 53
I have a pandas dataframe with the column ZIP which contains lots of ZIP Codes.
However, I get the source data as an .xlsx file. In Excel this column is a number field, so Excel drops the leading 0 from ZIP codes that start with a zero.
Normally a ZIP code in my country has five digits.
Now, I want to restore the leading zero in my dataframe column "ZIP". The rows with the missing zeros are easy to spot because they have 4 characters instead of 5.
2314
1234
1234
4322
1234
1234
4323
97862
43242
43212
43213
How can I add a '0' to all rows in the column "ZIP" where there are only four characters instead of the normal five?
I know that I could get a leading zero like this:
missing_zero = "0"
df['ZIP'] = missing_zero + df['ZIP'].map(str)
But this would add a zero to all the rows instead of just the rows with the missing zero.
I'd appreciate any insights on how to add the "where character length = 4" logic.
Thanks in advance!
Upvotes: 0
Views: 46
Reputation: 8768
Try this if your column is formatted as a string.
df['ZIP'] = df['ZIP'].str.zfill(5)
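If the column came out of Excel as integers rather than strings, the .str accessor is not available, so a cast to string would be needed first. A minimal sketch, assuming a small made-up frame (the values are borrowed from the question's list, and the frame itself is only for illustration):

```python
import pandas as pd

# Hypothetical sample: ZIP codes as they might arrive from Excel, stored as integers
df = pd.DataFrame({"ZIP": [2314, 1234, 97862, 43242]})

# Cast to string first, then left-pad with zeros to a total width of 5.
# Values that already have 5 characters are left unchanged.
df["ZIP"] = df["ZIP"].astype(str).str.zfill(5)

print(df["ZIP"].tolist())  # ['02314', '01234', '97862', '43242']
```

One caveat: if the column contains missing values, pandas may have read it as floats, in which case the cast yields strings like '2314.0' and would need separate handling.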
Upvotes: 2
Reputation: 861
Does something like this work for you?
df['ZIP'] = df['ZIP'].apply(lambda x: '0' + str(x) if len(str(x)) == 4 else str(x))
print(df)
     ZIP
0  02315
1  01234
2  01234
3  12345
4  12345
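The same "only where the length is 4" condition can probably also be written without a per-row lambda, using a boolean mask. This is just a sketch on a made-up sample frame, not a claim that it is faster for your data:

```python
import pandas as pd

# Hypothetical sample: a mix of 4-digit and 5-digit ZIP codes stored as integers
df = pd.DataFrame({"ZIP": [2314, 1234, 97862, 43242]})

zip_str = df["ZIP"].astype(str)       # work on string values
is_short = zip_str.str.len() == 4     # True for rows that lost their leading zero

# Series.mask replaces values where the condition is True,
# here with the same value prefixed by "0"; other rows stay as they are.
df["ZIP"] = zip_str.mask(is_short, "0" + zip_str)

print(df["ZIP"].tolist())  # ['02314', '01234', '97862', '43242']
```

Unlike zfill(5), this only touches values that are exactly 4 characters long, which matches the condition described in the question.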
Upvotes: 1