Reputation: 11
I'm somewhat new to Python and, to be honest, not very familiar with how it handles encoding.
Suppose that, while parsing text/html input, I end up with a path that looks like the following:
line = \\dfslocation\prj\gct\asw\sw_archive
However, somewhere earlier in the processing, it seems that escape sequences such as '\a' and '\t' are no longer stored as literal characters:
literal_line = "%r"%(line)
print literal_line
\\dfslocation\prj\gct\x07sw\\sw_archive
My best guess is that this happened when I converted the email into text:
for part in self.msg.walk():
    if part.get_content_type().startswith('text/plain'):
        plain_text_part = part.get_payload(decode=False)
        received_text += '\n'
        received_text += plain_text_part
received_text = received_text.encode('ascii', 'ignore')
Later on I want to use this as a network path, which requires it to be in its literal form, i.e. \a, not \x07 (the ASCII BEL character).
The brute-force way I can think of would be to search for every escape sequence listed at https://docs.python.org/2.0/ref/strings.html and replace each one with the corresponding literal text.
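For illustration, that brute-force idea might look roughly like the sketch below. The helper name `restore_backslash_escapes` and the mapping table are hypothetical, and the sketch assumes it is applied to a single path string in which tabs, newlines and other control characters can only have come from a mangled backslash sequence. It cannot recover octal or \x escapes.

```python
# Hypothetical mapping: each control character is turned back into the
# two-character backslash spelling it most likely came from.
_CONTROL_TO_ESCAPE = {
    '\a': r'\a',
    '\b': r'\b',
    '\f': r'\f',
    '\n': r'\n',
    '\r': r'\r',
    '\t': r'\t',
    '\v': r'\v',
}


def restore_backslash_escapes(path):
    """Replace control characters in `path` with their escape spelling."""
    return ''.join(_CONTROL_TO_ESCAPE.get(ch, ch) for ch in path)


# The damaged path from above: \a has already become the BEL character.
damaged = '\\\\dfslocation\\prj\\gct\x07sw\\sw_archive'
repaired = restore_backslash_escapes(damaged)
print(repaired)
```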
Is there a better way to do this?
Thanks
Upvotes: 1
Views: 746
Reputation: 11473
Try storing the content of the line variable as a raw string literal instead of a regular one.
In a regular string literal, \a is converted to \x07.
>>> line = "\\dfslocation\prj\gct\asw\sw_archive"
>>> line
'\\dfslocation\\prj\\gct\x07sw\\sw_archive'
However, if you write it as a raw string, using the r'<your_ascii_text>' format, the backslash sequences are not converted to special characters.
>>> line = r'\\dfslocation\prj\gct\asw\sw_archive'
>>> print line
\\dfslocation\prj\gct\asw\sw_archive
>>>
Raw strings treat \a as the two characters \ and a, which makes them well suited to Windows filenames and regular expressions.
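As a small side-by-side sketch of the difference (the variable names are only illustrative), the snippet below builds the same path once from a raw literal and once from a regular literal in which only \a is left unescaped. Escape processing applies to string literals in source code, so this shows the literal case only.

```python
# Raw literal: every backslash is kept exactly as typed.
raw_path = r'\\dfslocation\prj\gct\asw\sw_archive'

# Regular literal: backslashes are doubled except before "a", so the
# parser turns \a into the BEL character (0x07).
cooked_path = '\\\\dfslocation\\prj\\gct\asw\\sw_archive'

print(repr(raw_path))
print(repr(cooked_path))

# The raw version keeps backslash + "a" (two characters); the regular
# version holds a single BEL character in that position.
print('\x07' in raw_path)
print('\x07' in cooked_path)
```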
Upvotes: 1