Tao Yang

Reputation: 11

Python - How to convert escape sequences to string literals

I'm somewhat new to Python and, to be honest, not very familiar with how it handles encoding.

Suppose that, while parsing text/html input, I end up with a path that looks like the following:

line = \\dfslocation\prj\gct\asw\sw_archive

However, during an earlier part of the processing, it seems the escape sequences '\a' and '\t' are no longer stored as literals.

literal_line = "%r"%(line)
print literal_line   

\\dfslocation\prj\gct\x07sw\\sw_archive

My best guess is that it happened when I converted the email into text:

for part in self.msg.walk():
  if part.get_content_type().startswith('text/plain'):
    plain_text_part = part.get_payload(decode=False)
    received_text += '\n'
    received_text += plain_text_part

received_text = received_text.encode('ascii', 'ignore')

Later on I want to use this as a network path, which requires it to be in its literal form, i.e. \a, not \x07 (the ASCII BEL character).

The brute-force way I can think of would be to search for every escape sequence (https://docs.python.org/2.0/ref/strings.html) and replace it with the corresponding string literal.
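For illustration, the brute-force replacement described above might look roughly like the sketch below. The names CONTROL_TO_ESCAPE and restore_escapes are hypothetical, and the table covers only the single-letter escapes, not octal or hex forms. It also assumes that every control character in the text came from a mangled backslash sequence, so a genuine newline or tab would be rewritten too.

```python
# Hypothetical lookup table: each control character mapped back to the
# two-character backslash spelling it was (presumably) created from.
CONTROL_TO_ESCAPE = {
    '\a': r'\a',
    '\b': r'\b',
    '\f': r'\f',
    '\n': r'\n',
    '\r': r'\r',
    '\t': r'\t',
    '\v': r'\v',
}


def restore_escapes(text):
    """Replace each known control character with its backslash spelling."""
    return ''.join(CONTROL_TO_ESCAPE.get(ch, ch) for ch in text)


# The damaged path from the question: \a has already become \x07 (BEL).
damaged = '\\\\dfslocation\\prj\\gct\x07sw\\sw_archive'
print(restore_escapes(damaged))
```

Because the mapping cannot tell a real control character from a mangled one, this is best treated as a repair step of last resort.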

Is there a better way to do this?

Thanks

Upvotes: 1

Views: 746

Answers (1)

Anil_M

Reputation: 11473

Try storing the content of the line variable as a raw string rather than as an ordinary string literal.

If you store it as is, the \a is converted to \x07.

>>> line = "\\dfslocation\prj\gct\asw\sw_archive"
>>> line
'\\dfslocation\\prj\\gct\x07sw\\sw_archive'

However, if you store it as a raw string, using the r'<your_ascii_text>' format, the backslash sequences are not converted to special characters.

>>> line  = r'\\dfslocation\prj\gct\asw\sw_archive'
>>> print line
\\dfslocation\prj\gct\asw\sw_archive
>>>

Raw strings treat \a as the two characters \ and a, which makes them well suited to Windows file names and regular expressions.
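As a small sketch of the difference, the snippet below compares a raw literal with an ordinary one. The variable names are illustrative only. Note that escape processing applies to string literals written in source code; text read at runtime (from a file or an email payload, say) is not reinterpreted this way.

```python
# Raw literal: every backslash is kept exactly as typed.
raw_path = r'\\dfslocation\prj\gct\asw\sw_archive'

# Ordinary literal with every backslash doubled: the same string.
escaped_path = '\\\\dfslocation\\prj\\gct\\asw\\sw_archive'

# Ordinary literal where \a is left unescaped: it becomes BEL (\x07).
cooked = '\\\\dfslocation\\prj\\gct\asw\\sw_archive'

print(raw_path == escaped_path)      # True
print('\x07' in cooked)              # True
print('\x07' in raw_path)            # False
print(len(raw_path) - len(cooked))   # 1: two characters collapsed into one
```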

Upvotes: 1
