Reputation: 381
I knew that I could get unicode characters using an escape sequence, like this:
>>> print "\3"
♥
and I just wanted to look through the available ASCII characters, so I wrote this:
for i in xrange(1, 99):
    print "\%o" % i
and it prints "\1", "\2", "\3", etc., so not unicode characters. I then tried it using %s, %r, and %d and none of those seem to work either.
This turned out to be much more interesting than seeing the available ASCII characters, so I started reading about string formatting and ended up with this piece working:
for i in xrange(1, 99):
    print "{:c}".format(i)
The question is: why wasn't the initial code working?
Upvotes: 1
Views: 990
Reputation: 114461
Escape sequences in string literals are processed at "parse time", not at "run time". If you write
"\%o"
the Python parser sees a backslash followed by a percent sign. Because this is not a valid escape sequence, it keeps both characters and then adds o as a normal character. (In this respect Python differs from, e.g., C++, where a compiler would typically drop the backslash and treat that string as just "%o".)
At run time, the formatting operator sees a three-character string on its left side: a backslash followed by the %o sequence. Only the %o part is replaced by the right-hand value, giving, for example, the string "\\1" for the input value 1, and that string is displayed as \1.
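The two stages can be checked with a minimal sketch like the following, written so that it should run on both Python 2 and Python 3. The name template is only illustrative, and the literal is spelled "\\%o" here because newer Python versions warn about the unrecognized escape "\%"; both spellings produce the same three characters.

```python
# "\\%o" spells out explicitly the three characters that the parser
# stores for the literal "\%o": a backslash, a percent sign and an o.
template = "\\%o"
print(len(template))    # 3

# At run time only the %o placeholder is replaced; the backslash stays.
formatted = template % 1
print(len(formatted))   # 2
print(formatted)        # \1
print(repr(formatted))  # '\\1'
```

The repr() line shows the doubled backslash that the interactive interpreter would display; the string itself still contains only one backslash.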
Upvotes: 2
Reputation: 601401
String literals in Python source code are interpreted during lexical analysis – the first step of source code processing the Python compiler performs. The escape sequences are parsed, and only the resulting string is stored in memory. This is why e.g.
>>> "A"
'A'
>>> "\x41"
'A'
result in exactly the same string. Escape sequences are not processed while actually printing the string, or while performing string formatting. Printing basically means copying the contents of the string to the terminal. Formatting means replacing the % or {} placeholders with the desired contents. The rest of the string is left unchanged.
The result of the formatting operation
>>> "\%03o" % 65
'\\101'
is a string of four characters: \101. (In the interactive interpreter, a representation of this string is shown; that's why you see the quotes and the double backslash.) The string literal "\101", on the other hand, is a string with only a single character, namely a capital A.
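As a rough illustration of that difference, the sketch below compares the two strings directly. The variable names are made up for the example, and the format string is written with an explicit double backslash so that it runs without warnings on Python 3 as well.

```python
# Octal escape resolved by the parser: one character.
literal = "\101"

# Backslash kept as-is, %03o replaced at run time: four characters.
formatted = "\\%03o" % 65

print(len(literal))          # 1
print(len(formatted))        # 4
print(literal == "A")        # True
print(literal == formatted)  # False
```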
As pointed out by Martijn Pieters, you can explicitly request interpretation of escape sequences with the string_escape codec:
>>> ("\%03o" % 65).decode("string_escape")
'A'
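Note that string_escape only exists in Python 2. A possible equivalent that should also work on Python 3 goes through bytes and the unicode_escape codec; this is a sketch that assumes the formatted string is pure ASCII, as it is here.

```python
# Four characters: backslash, 1, 0, 1.
formatted = "\\%03o" % 65

# Encode to bytes first, then let the unicode_escape codec interpret
# the backslash sequence the way the parser would for a literal.
decoded = formatted.encode("ascii").decode("unicode_escape")

print(decoded)       # A
print(len(decoded))  # 1
```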
Upvotes: 1
Reputation: 1121306
Python is interpreting \%o as 'literal backslash followed by a string formatting code'; \% doesn't mean anything in a Python literal, so the backslash is included literally.
You are looking for the chr() function:
for i in xrange(1, 99):
    print chr(i)
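If the goal is to browse the characters together with their codes, a small helper along these lines might be handy. The function name ascii_table and the default range of 32 to 126 (the printable ASCII characters) are choices made for this sketch, not something from the question.

```python
def ascii_table(start=32, stop=127):
    """Return one 'code  character' line per code in [start, stop)."""
    return ["%3d  %s" % (code, chr(code)) for code in range(start, stop)]


lines = ascii_table()
print(len(lines))       # 95
print(lines[65 - 32])   #  65  A
```

The same lookup can be done with "{:c}".format(code) or "%c" % code; chr() just states the intent most directly.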
The \ character escapes only work in Python literals. You can instruct Python to interpret an arbitrary string containing a literal \ backslash plus code as a Python string literal using the string_escape codec:
>>> print repr('\\n'.decode('string_escape'))
'\n'
Note that the proper way to specify a unicode literal is to use the \uxxxx format, and to use a unicode string literal:
>>> print u'\u2665'
♥
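To see how this differs from the "\3" in the question, the code points can be compared as in the sketch below. It suggests that "\3" is just the character with code 3, so the heart shown for it presumably comes from the console's code page (an assumption; the question does not say which terminal was used).

```python
heart = u"\u2665"   # BLACK HEART SUIT, a real Unicode code point
control = "\3"      # octal escape for the character with code 3

print(ord(heart))    # 9829
print(ord(control))  # 3
```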
Raw bytes can also be generated using the \x00 escape sequence:
>>> print repr('\x0a')
'\n'
Upvotes: 2