Lee Jack

Reputation: 191

How can I deal with a garbled string like '\xe7\xbe\x8e'?

I have a list of words like s = ['a','\xe7\xbe\x8e\xe7','b'], and I want to remove members like '\xe7\xbe\x8e\xe7', but I cannot think of a useful method. I have never dealt with this kind of encoded or decoded text. I would welcome any suggestions in Python. Thanks!

Upvotes: 1

Views: 260

Answers (3)

Sohaib Farooqi

Reputation: 5666

You can check whether each word in the list is alphanumeric using the isalnum method. If a word is alphanumeric, keep it; otherwise drop it. This can be done with a list comprehension:

>>> s = ['a','\xe7\xbe\x8e\xe7','b']
>>> [a for a in s if a.isalnum()]
['a', 'b']

Note: isalnum checks whether the string is alphanumeric, i.e. contains only letters and/or digits. If you want to allow letters only, use isalpha instead.
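As a rough illustration of the difference between the two checks, here is a small sketch. It assumes Python 3 (where string literals are Unicode), and the sample list `words` is made up for this example rather than taken from the question.

```python
# Hypothetical sample data, not from the question.
words = ['a', 'b2', '42', '\xe7\xbe\x8e\xe7', 'a-b']

# isalnum(): every character must be a letter or a digit.
alnum_only = [w for w in words if w.isalnum()]

# isalpha(): every character must be a letter.
alpha_only = [w for w in words if w.isalpha()]

print(alnum_only)
print(alpha_only)
```

Keep in mind that both checks also drop ordinary words containing punctuation (such as 'a-b' here), and that in Python 3 a non-ASCII letter such as 'é' passes both, so this is not strictly an ASCII filter.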

Upvotes: 1

Maxim

Reputation: 34

Try this:

s = ['a','\xe7\xbe\x8e\xe7','b']
# Remove the unwanted value once for each time it occurs in the list.
for _ in range(s.count('\xe7\xbe\x8e\xe7')):
    s.remove('\xe7\xbe\x8e\xe7')

Then all occurrences of '\xe7\xbe\x8e\xe7' will be removed from the list.
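The loop above rescans the list on every remove call. A possible single-pass alternative, sketched here under the assumption that the exact unwanted value is known in advance (the name `unwanted` is only illustrative):

```python
s = ['a', '\xe7\xbe\x8e\xe7', 'b', '\xe7\xbe\x8e\xe7']

# The exact value to drop (illustrative name, not from the answer).
unwanted = '\xe7\xbe\x8e\xe7'

# Build a new list that keeps every entry except the unwanted one.
s = [w for w in s if w != unwanted]
print(s)
```

Like the loop, this only removes that one exact string; other garbled entries would stay in the list.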

Upvotes: 0

whackamadoodle3000

Reputation: 6748

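def is_ascii(s):
    # True only if every character's code point is in the ASCII range.
    return all(ord(c) < 128 for c in s)

s = ['a','\xe7\xbe\x8e\xe7','b']
s = [e for e in s if is_ascii(e)]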

Try this. It removes entries that contain non-ASCII characters (like \xe7\xbe\x8e\xe7). Hope this helps!
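On Python 3.7 or later, the built-in str.isascii method performs the same check, so the helper function may not be needed. A minimal sketch, assuming that Python version (the name `ascii_only` is illustrative):

```python
s = ['a', '\xe7\xbe\x8e\xe7', 'b']

# str.isascii() is True when every character is in the ASCII range
# (it is also True for an empty string).
ascii_only = [e for e in s if e.isascii()]
print(ascii_only)
```

On older Python versions the method does not exist, so a hand-written check like is_ascii remains the portable option.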

Upvotes: 1
