Reputation: 33
I'm writing a Python script to transfer Excel Online data to GCP, and I would like to remove \xa0 from the strings in a DataFrame column, for example '\xa0shopName', '\xa0Street Adress', '\xa0'.
I've tried
df = df.replace(u'\xa0', u'')
but it only replaces cells that are exactly '\xa0'; strings that contain \xa0 together with other text stay the same. Maybe a regex such as
df = re.sub('#regular expression', '', df)
would help, but I cannot find the correct regular expression.
Upvotes: 3
Views: 2420
Reputation: 627262
You can use
df = df.replace('\xa0', '', regex=True)
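Passing the regex=True option makes pandas treat the pattern as a regular expression (re.sub behind the scenes), so every occurrence of a non-breaking space inside a string is replaced with an empty string, not only cells that consist of '\xa0' alone.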
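To make the difference concrete, here is a minimal sketch comparing the two calls. The column name and sample values are assumptions taken from the strings in the question, not from the real spreadsheet.

```python
import pandas as pd

# Hypothetical sample data mirroring the strings in the question.
df = pd.DataFrame({"name": ["\xa0shopName", "\xa0Street Adress", "\xa0"]})

# Without regex=True, replace() only matches cells whose entire value is "\xa0".
whole_cell = df.replace("\xa0", "")

# With regex=True, the pattern is matched inside each string.
cleaned = df.replace("\xa0", "", regex=True)

print(whole_cell["name"].tolist())  # only the last cell changes
print(cleaned["name"].tolist())     # the non-breaking space is gone everywhere
```

If only one column needs cleaning, the same arguments should work on that Series alone, for example df["name"].replace("\xa0", "", regex=True).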
Upvotes: 4
Reputation: 19684
I believe you're running into an issue with how something is presented versus how it's represented. Hex a0 is decimal 160, the non-breaking space, and it is written in a string literal as \xa0. Do your strings contain the literal characters \xa0 (backslash, x, a, 0), or is \xa0 just how the output displays that single character? If it's the former, you need to escape your backslash (here, I use a raw string instead):
df.replace(r"\xa0", "")
If it's the latter, your existing call already targets the right character (though without regex=True it only matches cells whose whole value is \xa0):
df.replace("\xa0", "")
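A quick way to tell the two cases apart is to compare lengths and repr() output. This is only a sketch with made-up variable names (actual, literal); in practice you would inspect a value pulled from your own DataFrame.

```python
# One character (U+00A0) followed by the text.
actual = "\xa0shopName"
# Four characters (backslash, x, a, 0) followed by the text.
literal = r"\xa0shopName"

print(len(actual), len(literal))  # 9 12
print(repr(actual))               # shows a single backslash: the escape for U+00A0
print(repr(literal))              # shows a doubled backslash: a real backslash is present

# Each case needs its own search string.
print(actual.replace("\xa0", ""))
print(literal.replace(r"\xa0", ""))
```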
Upvotes: 0
Reputation: 5590
You can just use .strip() to remove that character if it appears at the beginning or end of your strings:
>>> a='\xa0Street Adress'
>>> a[0]
'\xa0'
>>> a.strip()
'Street Adress'
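The same idea can be applied to a whole column with the pandas .str accessor. This is a sketch under the assumption that the values are Python 3 strings; the column name and the second sample value are hypothetical. Note that it leaves a non-breaking space in the middle of a string untouched.

```python
import pandas as pd

# Hypothetical column; the middle value has \xa0 inside the text, not at the edge.
df = pd.DataFrame({"address": ["\xa0Street Adress", "Main\xa0Street", "\xa0"]})

# str.strip() with no argument removes leading and trailing whitespace,
# and Python 3 counts the non-breaking space as whitespace.
stripped = df["address"].str.strip()

print(stripped.tolist())
```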
Upvotes: 0