Reputation: 44441
According to the documentation, it is possible to define the encoding of the literals used in a Python source file like this:
# -*- coding: latin-1 -*-
u = u'abcdé' # This is a unicode string encoded in latin-1
Is there any syntax support to specify the encoding on a per-literal basis? I am looking for something like:
latin1 = u('latin-1')'abcdé' # This is a unicode string encoded in latin-1
utf8 = u('utf-8')'xxxxx' # This is a unicode string encoded in utf-8
I know that syntax does not make sense, but I am looking for something similar. What can I do? Or is it maybe not possible to have a single source file with unicode strings in different encodings?
Upvotes: 1
Views: 78
Reputation: 1123780
No, there is no way to mark a unicode literal as using a different encoding from the rest of the source file.
Instead, you'd manually decode the literal from a bytestring:
latin1 = 'abcdé'.decode('latin1') # provided `é` is stored in the source as an E9 byte.
or using escape sequences:
latin1 = 'abcd\xe9'.decode('latin1')
The whole point of the source-code codec line is to support using an arbitrary codec in your editor. Source code should never use mixed encodings, really.
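As a minimal sketch of the escape-sequence approach, the snippet below keeps the source file pure ASCII and decodes two bytestrings that hold the same text in different encodings. The variable names are made up for illustration, and the `b''` prefix is an assumption added so the same code also runs on Python 3 (on Python 2.6+ it is accepted and means a plain `str`).

```python
# Hypothetical names; every non-ASCII byte is written as an escape,
# so the source file itself needs no special encoding declaration.
latin1_bytes = b'abcd\xe9'      # 'é' as the single Latin-1 byte E9
utf8_bytes = b'abcd\xc3\xa9'    # 'é' as the two UTF-8 bytes C3 A9

# Decode each bytestring with the codec that matches its bytes.
latin1 = latin1_bytes.decode('latin-1')
utf8 = utf8_bytes.decode('utf-8')

# Both results are the same five-character unicode string.
print(latin1 == utf8)
print(len(latin1), len(latin1_bytes), len(utf8_bytes))
```

Once decoded, the two values are indistinguishable: the encoding is a property of the bytes, not of the resulting unicode string.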
Upvotes: 1