Reputation: 10122
Consider the following simple class:
>>> class W(object):
...     def __str__(self):
...         print "entering __str__"
...         return u"a"
...
>>> w = W()
Please notice that:
- the __str__ method prints a trace message each time it is called;
- the __str__ method returns a rogue unicode value;
- w is an instance of class W, used in the following doctests.
Now, first consider this relatively intuitive doctest session:
>>> u"%s" % w
entering __str__
u'a'
>>> w.__str__()
entering __str__
u'a'
Now the WTF doctest session:
>>> "%s" % w
entering __str__
entering __str__
u'a'
>>> str(w)
entering __str__
'a'
Can you explain why:
- __str__ is called twice in the first example?
- w.__str__() doesn't give the same output as str(w)?
Thanks for your insights on these topics. Any pointers to docs (or better, code!) are welcome.
Upvotes: 0
Views: 171
Reputation: 414
Let's find out what's going on here. First we need to figure out the opcode for the % operator:
>>> import dis
>>> def modop():
...     '%s' % w
...
>>> dis.dis(modop)
  2           0 LOAD_CONST               1 ('%s')
              3 LOAD_GLOBAL              0 (w)
              6 BINARY_MODULO
              7 POP_TOP
              8 LOAD_CONST               0 (None)
             11 RETURN_VALUE
OK, so we need to check ceval.c for the BINARY_MODULO opcode to see what Python is doing. Here's the source (Python-2.7.6\Python\ceval.c):
    case BINARY_MODULO:
        w = POP();
        v = TOP();
        if (PyString_CheckExact(v))
            x = PyString_Format(v, w);
        else
            x = PyNumber_Remainder(v, w);
        Py_DECREF(v);
        Py_DECREF(w);
        SET_TOP(x);
        if (x != NULL) continue;
        break;
Doing a search of the Python source for "PyString_Format" we find the function is defined in Python-2.7.6\Objects\stringobject.c. Around line 4447 we find:
#ifdef Py_USING_UNICODE
        if (PyUnicode_Check(v)) {
            fmt = fmt_start;
            argidx = argidx_start;
            goto unicode;
        }
#endif
        temp = _PyObject_Str(v);
#ifdef Py_USING_UNICODE
        if (temp != NULL && PyUnicode_Check(temp)) {
            Py_DECREF(temp);
            fmt = fmt_start;
            argidx = argidx_start;
            goto unicode;
        }
#endif
The goto jumps to the unicode: label, which then calls
v = PyUnicode_Format(format, args);
So, to explain
>>> "%s" % w
entering __str__
entering __str__
u'a'
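PyUnicode_Check is only a type check and never calls __str__, so it cannot be the source of the extra call. The first call comes from _PyObject_Str(v) in the excerpt above. Because __str__ returned a unicode object, the second PyUnicode_Check succeeds, temp is thrown away, and the goto restarts the whole formatting in PyUnicode_Format. That function has to convert w to a string again, which calls __str__ a second time. I haven't read PyUnicode_Format thoroughly, so the exact path to that second call is a bit of a guess.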
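As a rough illustration of the control flow in the excerpt above, here is a toy pure-Python model of a single %s conversion. It is not CPython code: the names UnicodeLike, Traced, to_string, string_format and unicode_format are all made up for this sketch. The only thing it assumes from the source is that the argument is converted once by _PyObject_Str(v) before the goto, and converted again once formatting restarts on the unicode path.

```python
class UnicodeLike(object):
    """Hypothetical stand-in for Python 2's unicode type."""
    def __init__(self, text):
        self.text = text


class Traced(object):
    """Plays the role of W: its converter returns a unicode-like value."""
    def __init__(self):
        self.calls = 0

    def to_string(self):
        # Stands in for __str__; counts how often it is invoked.
        self.calls += 1
        return UnicodeLike("a")


def unicode_format(arg):
    # Models the unicode path: formatting starts over from scratch,
    # so the argument has to be converted again.
    return arg.to_string()


def string_format(arg):
    # Models the byte-string path: convert first, then inspect the result.
    temp = arg.to_string()  # first call, like temp = _PyObject_Str(v)
    if isinstance(temp, UnicodeLike):
        # temp is discarded (the Py_DECREF) and the unicode path takes over.
        return unicode_format(arg)  # second call happens in here
    return temp


w = Traced()
result = string_format(w)
print(w.calls)
```

In this model the converter runs twice for one format operation, which matches the two "entering __str__" lines in the session above.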
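str() always returns type str, never unicode: when __str__ returns a unicode object, str() encodes it with the default encoding (ASCII), which gives 'a'. Calling w.__str__() directly skips that step and hands back the raw u'a', so the difference makes sense.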
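The difference between the two calls can be sketched in Python as well. py2_str below is a hypothetical helper, not a real API; it only mimics the idea that str() takes whatever __str__ returns and turns a unicode result into a byte string, assuming the default encoding is ASCII.

```python
class W(object):
    def __str__(self):
        return u"a"  # the "rogue" unicode return value from the question


def py2_str(obj):
    """Hypothetical stand-in for Python 2's str(obj)."""
    res = obj.__str__()             # raw value, exactly what w.__str__() returns
    if isinstance(res, type(u"")):  # unicode result: encode it to a byte string
        res = res.encode("ascii")   # assumption: the default encoding is ASCII
    return res


w = W()
raw = w.__str__()        # unicode text, untouched
converted = py2_str(w)   # byte string after the encoding step
```

The direct method call returns the unicode value as is, while the str()-like helper returns its encoded form, so the two results have different types even though they hold the same character.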
Upvotes: 1