Reputation: 55
Consider the following list (I forgot to mention that my list also contains numbers, i.e. ints):
foo_list = [['foo', 100], ['\xa0foo', 200], ['foo\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0', 300], ['foo', 400]]
I've tried cleaning the list with the following function I found on SO when I was googling:
def remove_from_list(l, x):
    new_list = [li.replace(x, u'') for li in l]
    return new_list
foo_list_clean = remove_from_list(foo_list, u'\xa0')
This gives me a new error:
AttributeError: 'int' object has no attribute 'replace'
Is it because it's a list of lists? How could I modify the code so that it works and removes the '\xa0' character?
My expected output would be a new list with the cleaned values from foo_list.
Upvotes: 3
Views: 1043
Reputation: 1124070
Simply use str.strip() on the first element, leaving the rest of the inner list intact:
[[inner[0].strip('\xa0')] + inner[1:] for inner in foo_list]
\xa0 is a non-breaking space; provided your values are Unicode strings, it is stripped without specifying an argument. Your sample input consists of bytestrings, so I used an explicit strip:
>>> foo_list = [['foo', 100], ['\xa0foo', 200], ['foo\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0', 300], ['foo', 400]]
>>> [[inner[0].strip('\xa0')] + inner[1:] for inner in foo_list]
[['foo', 100], ['foo', 200], ['foo', 300], ['foo', 400]]
Your own approach would work fine too, but you need to use the function on slices of each nested list:
foo_list_clean = [remove_from_list(inner[:1], u'\xa0') + inner[1:] for inner in foo_list]
However, using str.replace() is not needed unless you have those \xa0 non-breaking spaces in between words; your sample only contains them at the starts and ends.
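To make the strip-versus-replace difference concrete, here is a minimal sketch. It assumes Python 3 (where every str is Unicode), and the sample value with a non-breaking space between two words is made up for illustration; it is not from your data:

```python
# Hypothetical sample: non-breaking spaces at both ends and between two words.
value = '\xa0foo\xa0bar\xa0'

# strip() only removes the character from the start and end of the string.
stripped = value.strip('\xa0')

# replace() removes every occurrence, including the one between the words.
replaced = value.replace('\xa0', '')

print(repr(stripped))  # 'foo\xa0bar'
print(repr(replaced))  # 'foobar'
```

So replace() only matters if you also want to get rid of the inner occurrences.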
Note that if some elements are integers and others are strings, you'll have to do some duck typing:
[[s.strip('\xa0') if hasattr(s, 'strip') else s for s in inner]
 for inner in foo_list]
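If you'd rather have that as a reusable function, something along these lines should do. This is only a sketch assuming Python 3; the name clean_nested, its chars parameter and the mixed-order sample row are my own inventions for illustration:

```python
def clean_nested(rows, chars='\xa0'):
    """Return a new list of lists with `chars` stripped from string-like items.

    Items without a .strip() method (ints, for example) are passed through
    unchanged; the input list itself is not modified.
    """
    return [
        [item.strip(chars) if hasattr(item, 'strip') else item for item in row]
        for row in rows
    ]


# Sample data; the last row (int first, string second) is hypothetical.
foo_list = [['foo', 100], ['\xa0foo', 200], ['foo\xa0\xa0', 300], [400, 'foo\xa0']]
foo_list_clean = clean_nested(foo_list)
print(foo_list_clean)
```

Because it checks each item rather than assuming the string is always at index 0, it doesn't matter where in the inner list the integers sit.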
Note that if your inputs are instead unicode objects, you'll have to use a matching u'\xa0' string to strip with! Alternatively, just use unicode.strip() without arguments to remove all whitespace from the start and end (as \xa0 is U+00A0 NO-BREAK SPACE and is considered whitespace):
>>> foo_list = [[u'foo', 100], [u'\xa0foo', 200], [u'foo\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0', 300], [u'foo', 400]]
>>> [[inner[0].strip()] + inner[1:] for inner in foo_list]
[[u'foo', 100], [u'foo', 200], [u'foo', 300], [u'foo', 400]]
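For what it's worth, the same argument-less approach should carry over to Python 3, where every str is already Unicode and no u prefix is needed. A quick sketch, assuming Python 3 (the names cleaned and foo_list are just for this example):

```python
foo_list = [['foo', 100], ['\xa0foo', 200], ['foo\xa0\xa0\xa0', 300], ['foo', 400]]

# str.strip() without arguments removes all leading/trailing whitespace,
# and U+00A0 counts as whitespace for Python 3 strings.
cleaned = [[inner[0].strip()] + inner[1:] for inner in foo_list]

print('\xa0'.isspace())  # True
print(cleaned)           # [['foo', 100], ['foo', 200], ['foo', 300], ['foo', 400]]
```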
Upvotes: 2