Reputation: 1268
I'm writing a script to parse specific data from each Outlook email.
I wrote something to strip out all carriage returns, newlines, and whitespace from my string before parsing it, but it's very ugly. Any ideas on making it more elegant?
messageStr = messageStr.replace("\r","")
messageStr = messageStr.split('\n')
messageStr = [i for i in messageStr if i != '']
messageStr = [i for i in messageStr if i != ' ']
Upvotes: 0
Views: 86
Reputation: 12679
This is a data-cleaning task. Here is my approach:
Put all the unwanted symbols in a list, then drop any token that appears in that list.
dummy_string='Hello this is \n example \r to remove '' the special symbols ' ''
special_sym=['\r','\n','',' ']
# split() breaks the string on any run of whitespace; keep only tokens not in special_sym
cleaned = [tok for tok in dummy_string.split() if tok not in special_sym]
print(" ".join(cleaned))
output:
Hello this is example to remove the special symbols
P.S.: you don't need '\r' and '\n' in the special_sym list, because split() with no arguments already splits on them and discards them. I included them only as an example.
Upvotes: 1
Reputation: 41051
The .strip method of strings removes leading and trailing whitespace. If you want to get rid of the carriage returns on each line, along with any other leading/trailing whitespace, you could do this:
lines = [line.strip() for line in message.split('\n')]
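To also drop the lines that end up empty, which is what the last two list comprehensions in the question do, the strip can be combined with a filter. This is only a minimal sketch: the sample text in `message` is made up for illustration, and `splitlines()` is swapped in for `split('\n')` so that `\r\n` endings are handled in one step.

```python
# Hypothetical sample body; a real Outlook message would come from your own code.
message = "Hello\r\n\r\n  parse me  \r\n \r\nBye\r\n"

# splitlines() splits on \n, \r\n and \r; strip() trims what is left on each line.
stripped = (line.strip() for line in message.splitlines())

# Keep only the lines that still have content after stripping.
lines = [line for line in stripped if line]

print(lines)  # ['Hello', 'parse me', 'Bye']
```

An empty string is falsy in Python, so `if line` covers both the `''` and `' '` checks from the question once each line has been stripped.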
If you want to remove all whitespace, not just leading/trailing, you could do something similar against a string containing all the whitespace characters you want to filter. The string module has a helper for this, string.whitespace. The following would remove all whitespace from a string s:
import string
filtered_string = ''.join(char for char in s if char not in string.whitespace)
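As a quick check of the filter above, here is a small self-contained run. The sample value of `s` is an assumption chosen for illustration. It also shows a shorter alternative, joining the pieces returned by `split()`, which should give the same result for ordinary ASCII whitespace.

```python
import string

# Hypothetical sample input containing spaces, a carriage return, a newline and a tab.
s = " Hello \r\n wor ld\t"

# Drop every character that appears in string.whitespace.
filtered_string = ''.join(char for char in s if char not in string.whitespace)
print(filtered_string)  # Helloworld

# Alternative: split() with no arguments splits on runs of whitespace,
# so joining the pieces with an empty string removes all of it as well.
alternative = ''.join(s.split())
print(alternative == filtered_string)  # True
```

Note that `string.whitespace` only covers ASCII whitespace, while `split()` also treats Unicode whitespace such as a non-breaking space as a separator, so the two can differ on non-ASCII input.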
Upvotes: 1