Reputation: 402
I have a dictionary:
a = {'age': '12\xa0', 'name': 'pks\xa0\xa0'}
I want to replace every non-ASCII character with a space.
For a plain string (not a dict), we use:
''.join([i if 32 < ord(i) < 126 else " " for i in a])
How can I apply this to a dictionary? Any help would be appreciated.
Upvotes: 3
Views: 1963
Reputation: 24555
You can iterate over the dictionary and use map:
for k, v in a.items():
    a[k] = "".join(map(lambda c: c if 32 < ord(c) < 127 else " ", v))
print(a)
This gives the following output:
{'name': 'pks  ', 'age': '12 '}
Upvotes: 0
Reputation: 107297
You don't need a list comprehension or ord; just encode to ASCII and ignore the errors:
In [106]: {key:value.encode('ascii',errors='ignore') for key, value in a.items()}
Out[106]: {'age': b'12', 'name': b'pks'}
If you want spaces in place of the removed characters, here is an efficient way:
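In [117]: def replace_nonascii(mydict):
   .....:     # Iterate over the argument, not the global `a`
   .....:     for key, value in mydict.items():
   .....:         new = value.encode('ascii', errors='ignore')
   .....:         # Pad with one space per dropped character
   .....:         yield key, new + b' ' * (len(value) - len(new))
   .....: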
In [118]: dict(replace_nonascii(a))
Out[118]: {'age': b'12 ', 'name': b'pks  '}
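Note that the generator above appends the padding at the end of each value and returns bytes, so it only matches a true replacement when the non-ASCII characters are trailing, as they are in the question. If they can appear mid-string, a minimal sketch like the following keeps each space at the original position and returns str. The name replace_nonascii_inplace is hypothetical, not part of the answer.

```python
def replace_nonascii_inplace(mydict):
    """Return a new dict with every non-ASCII character replaced by a space.

    Hypothetical helper: assumes all values are str.
    """
    return {
        # ord(ch) < 128 is true exactly for ASCII characters
        key: "".join(ch if ord(ch) < 128 else " " for ch in value)
        for key, value in mydict.items()
    }


sample = {'age': '12\xa0', 'name': 'p\xa0ks\xa0'}
print(replace_nonascii_inplace(sample))
```

The trade-off is speed: this walks each string character by character in Python, while encode does the work in C.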
Upvotes: 4
Reputation: 402563
Building on the answer from this question, you can use re.sub to replace each non-ASCII character with a space:
>>> import re
>>> {k : re.sub(r'[^\x00-\x7F]',' ', v) for k, v in a.items()}
{'age': '12 ', 'name': 'pks  '}
This should work on Python 3.x as well as Python 2.x.
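If the same cleanup runs over many dictionaries, the pattern can be compiled once and reused. The sketch below also passes non-string values through unchanged, which is an assumption beyond the question, where every value is a string. The names NON_ASCII and clean_values are hypothetical.

```python
import re

# Matches any single character outside the ASCII range 0x00-0x7F
NON_ASCII = re.compile(r'[^\x00-\x7F]')


def clean_values(d):
    """Return a new dict; str values get non-ASCII characters replaced by spaces."""
    return {
        k: NON_ASCII.sub(' ', v) if isinstance(v, str) else v
        for k, v in d.items()
    }


print(clean_values({'age': '12\xa0', 'name': 'pks\xa0\xa0', 'id': 7}))
```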
Upvotes: 2
Reputation: 36682
You can handle this by wrapping the line of code you provided, which replaces every character outside the printable ASCII range with a space, in a function and applying it to each value in the dictionary:
def remove_non_printable_ascii(s):
    return ''.join([c if 32 < ord(c) < 127 else " " for c in s])

a = {'age': '12\xa0', 'name': 'pks\xa0\xa0'}
for k in a:
    a[k] = remove_non_printable_ascii(a[k])
a
output:
{'age': '12 ', 'name': 'pks  '}
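The range check used here and the regex [^\x00-\x7F] from the other answer are not equivalent: the range check also replaces ASCII control characters such as tabs and newlines, while the regex keeps everything in 0-127. The question's original upper bound of 126 would additionally replace "~". A small sketch of the difference, with hypothetical helper names:

```python
import re


def keep_printable(s):
    # Keeps only printable ASCII (space through "~"); everything else becomes a space
    return ''.join(c if 32 <= ord(c) < 127 else ' ' for c in s)


def keep_ascii(s):
    # Keeps all ASCII, including control characters; only non-ASCII becomes a space
    return re.sub(r'[^\x00-\x7F]', ' ', s)


sample = 'a\tb\xa0~'
print(repr(keep_printable(sample)))  # 'a b ~'
print(repr(keep_ascii(sample)))      # 'a\tb ~'
```

Which one fits depends on whether control characters in the values should survive.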
Upvotes: 2