Reputation: 1
I am currently working with email data and when extracting from Outlook, the body of the email still keeps all of the escape characters within the string.
I'm using the re package in Python to achieve this, but to no avail.
Here's an example of text I'm trying to rid the escape characters from:
I am completely in agreement with that. \r\n\r\n\rbest regards.
Expected:
I'd like this to read: "I am completely in agreement with that. best regards."
I've tried the following to extract the unwanted text:
re.findall(r'\\\w+', string)
re.findall(r'\\*\w+', string)
re.findall(r'\\[a-z]+', string)
None of these are doing the trick. I'd appreciate any help!
Thanks!
Upvotes: 0
Views: 215
Reputation: 1721
You can try this:
re.sub(r'\n|\r','', string)
'I am completely in agreement with that. best regards.'
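For reference, here is a minimal self-contained sketch of the same idea, assuming the email body contains real carriage-return and line-feed characters rather than literal backslashes. The sample text is copied from the question, and the helper name `strip_line_breaks` is made up for illustration; it uses a character class instead of alternation, which should behave the same way here.

```python
import re

# Sample body from the question; "\r" and "\n" here are real control
# characters (one character each), not a backslash followed by a letter.
body = "I am completely in agreement with that. \r\n\r\n\rbest regards."


def strip_line_breaks(text: str) -> str:
    """Remove every carriage return and line feed from text."""
    return re.sub(r"[\r\n]+", "", text)


cleaned = strip_line_breaks(body)
print(cleaned)
```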
Upvotes: 3
Reputation: 333
You can write a function yourself:
def function(string):
    # Drop each literal backslash together with the character after it.
    while '\\' in string:
        ind = string.find('\\')
        string = string[:ind] + string[ind+2:]
    return string
Upvotes: 0
Reputation: 301
It seems you want to get rid of the line breaks. If so, you don't need the re module; just use:
string.replace("\r", "").replace("\n", "")
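Another way to do this without the re module, sketched here under the assumption that the string holds real "\r" and "\n" characters, is `str.translate` with a deletion table. It removes both characters in one pass, including a lone "\r" like the one in the question's sample. The variable names are only for illustration.

```python
# Sample body from the question, with real CR/LF control characters.
body = "I am completely in agreement with that. \r\n\r\n\rbest regards."

# Third argument of str.maketrans lists the characters to delete.
delete_line_breaks = str.maketrans("", "", "\r\n")

cleaned = body.translate(delete_line_breaks)
print(cleaned)
```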
Upvotes: 0
Reputation: 16660
You are confusing the escaped representation of whitespace characters with the characters themselves (please read more about them here).
You should rather be looking for the \r and \n characters this way:
re.findall(r'\n\w+', string)
or
re.findall(r'\r\w+', string)
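To make the distinction concrete, here is a small sketch comparing a string with real control characters against one with literal backslashes. It assumes the question's sample text; the variable names are hypothetical. The pattern `\\\w+` from the question only matches when an actual backslash is present in the data.

```python
import re

# Real control characters: "\r" and "\n" are one character each.
real = "that. \r\n\r\n\rbest regards."
# Literal text: a raw string keeps each backslash as its own character.
literal = r"that. \r\n\r\n\rbest regards."

# The question's pattern looks for a backslash followed by word characters.
matches_real = re.findall(r"\\\w+", real)        # no backslashes to find
matches_literal = re.findall(r"\\\w+", literal)  # finds the escape sequences

# Cleaning each variant needs a different pattern.
cleaned_real = re.sub(r"[\r\n]", "", real)
cleaned_literal = re.sub(r"\\[rn]", "", literal)

print(len("\r"), len(r"\r"))
print(matches_real)
print(cleaned_real == cleaned_literal)
```

If `repr(string)` shows doubled backslashes such as `\\r\\n`, the data holds literal backslashes and the second cleaning pattern applies; otherwise the first one does.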
Upvotes: 0