AzFlin

Reputation: 2030

Confusing behaviour when replacing values in a pandas column

I have a DataFrame, 'metrospot', with a 'Postal Code' column, and I want to remove the spaces from each postal code. The following does not alter the DataFrame:

metrospot['Postal Code'] = metrospot['Postal Code'].replace(" ","")

But the below code will:

metrospot['Postal Code'] = metrospot['Postal Code'] + "foo"

I had to resort to butchery like this to proceed:

for i,j in zip(metrospot['Postal Code'],range(len(metrospot))):
    i = i.replace(" ","")
    metrospot.loc[j,'Postal Code']=i

What is the correct way to do this and why does the above behaviour happen? Thank you.

Upvotes: 0

Views: 52

Answers (1)

BrenBarn

Reputation: 251448

You are calling replace on the Series object. This is not the string replace method but the pandas Series.replace method, which by default replaces entire values. So only a cell whose whole value is " " (i.e., a cell containing just a single space and nothing else) would be replaced with an empty string; spaces inside longer strings are left untouched.

If you want to use string replacement, use the str attribute:

metrospot['Postal Code'] = metrospot['Postal Code'].str.replace(" ","")
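To see the difference side by side, here is a minimal sketch. The sample postal codes are made up for illustration and stand in for the real 'metrospot' data; the regex=False argument is an addition that asks for a literal (non-regex) match and assumes a reasonably recent pandas version.

```python
import pandas as pd

# Hypothetical sample data standing in for the asker's 'metrospot' frame.
metrospot = pd.DataFrame({"Postal Code": ["M5V 2T6", " ", "K1A 0B1"]})

# Series.replace compares whole cell values, so only the cell that is
# exactly " " is changed; spaces inside longer strings are left alone.
whole_value = metrospot["Postal Code"].replace(" ", "")

# Series.str.replace works on substrings inside each string.
# regex=False treats " " as a literal rather than a pattern.
substring = metrospot["Postal Code"].str.replace(" ", "", regex=False)

# Series.replace can also do substring replacement when told to use regex.
via_regex = metrospot["Postal Code"].replace(" ", "", regex=True)

print(whole_value.tolist())  # ['M5V 2T6', '', 'K1A 0B1']
print(substring.tolist())    # ['M5V2T6', '', 'K1A0B1']
print(via_regex.tolist())    # ['M5V2T6', '', 'K1A0B1']
```

Either of the last two forms can be assigned back to metrospot['Postal Code'] in place of the loop from the question.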

Upvotes: 2
