Reputation: 37
I have a list, keys, that contains words. When I run this loop:
for key in keys:
    print(key)
I get normal output in the terminal.
But when I print the entire list using print(keys), I get this output:
I have tried key.replace("\u202c", ''), key.replace("\\u202c", ''), and re.sub(u'\u202c', '', key), but none of them solved the problem.
I also tried the solutions here, but none of them worked either:
Replacing a unicode character in a string in Python 3
Removing unicode \u2026 like characters in a string in python2.7
Python removing extra special unicode characters
How can I remove non-ASCII characters but leave periods and spaces using Python?
I scraped this from Google Trends using Beautiful Soup and retrieved the text with get_text().
Also, in the page source of the Google Trends page, the words are listed as follows:
When I pasted the text here directly from the page source, the text pasted without these unusual symbols.
Upvotes: 1
Views: 3032
Reputation: 8769
You can strip the characters from both ends of each string using strip:
>>> keys=['\u202cABCD', '\u202cXYZ\u202c']
>>> for key in keys:
...     print(key)
...
ABCD
XYZ
>>> newkeys=[key.strip('\u202c') for key in keys]
>>> print(keys)
['\u202cABCD', '\u202cXYZ\u202c']
>>> print(newkeys)
['ABCD', 'XYZ']
>>>
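A short sketch of two points behind the session above. First, print(keys) displays the repr() of each element, and repr() escapes non-printable characters such as U+202C, which is why the escape shows up only when the whole list is printed. Second, strip only removes characters from the ends of a string, so it will miss a U+202C that sits in the middle of a word; replace handles that case. The sample strings here are made up for illustration and are not the asker's data.

```python
# Hypothetical sample data: the second string has U+202C in the middle too.
keys = ['\u202cABCD', '\u202cXY\u202cZ\u202c']

# print(keys) shows repr() of each element, which escapes U+202C.
shown = repr(keys[0])
print(shown)

# strip() only removes the character at the start and end of each string.
stripped = [key.strip('\u202c') for key in keys]

# replace() removes every occurrence, wherever it appears.
replaced = [key.replace('\u202c', '') for key in keys]

print(stripped)
print(replaced)
```

If the invisible character can only appear at the edges, strip is enough; otherwise replace is the safer choice.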
I tried one of your methods, and it does work for me:
>>> keys
['\u202cABCD', '\u202cXYZ\u202c']
>>> newkeys=[]
>>> for key in keys:
...     newkeys += [key.replace('\u202c', '')]
...
>>> newkeys
['ABCD', 'XYZ']
>>>
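One possible reason the replace attempts in the question appeared to fail, though the question does not show enough code to confirm it, is that Python strings are immutable: key.replace(...) returns a new string and leaves both key and the list untouched unless the result is stored. The sketch below shows that, along with a broader cleanup that drops every character in the Unicode "Cf" (format) category, which includes U+202C. The helper name remove_format_chars and the sample strings are hypothetical, chosen only for illustration.

```python
import unicodedata


def remove_format_chars(text):
    """Return text without Unicode format (category 'Cf') characters."""
    return ''.join(ch for ch in text if unicodedata.category(ch) != 'Cf')


# Hypothetical sample data with two different directional formatting marks.
keys = ['\u202aABCD\u202c', '\u202cXYZ\u202c']

# Calling replace() without storing the result changes nothing in the list.
for key in keys:
    key.replace('\u202c', '')
unchanged = list(keys)

# Building a new list from the cleaned strings keeps the result.
cleaned = [remove_format_chars(key) for key in keys]

print(unchanged == keys)
print(cleaned)
```

Filtering by category removes other invisible formatting marks as well, so it is only appropriate when none of them carry meaning in the scraped text.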
Upvotes: 2