Reputation: 4633
I've been exploring the internal implementation of threads in Python this week. It's amazing how every day I'm surprised by how much I didn't know; not knowing what I want to know is what makes me itch.
I noticed something strange in a piece of code that I ran under Python 2.7 as a multithreaded application. We all know that Python 2.7 switches between threads after 100 virtual instructions by default. Calling a function is one virtual instruction, for example:
>>> from __future__ import print_function
>>> import dis
>>> def x(): print('a')
...
>>> dis.dis(x)
  1           0 LOAD_GLOBAL              0 (print)
              3 LOAD_CONST               1 ('a')
              6 CALL_FUNCTION            1
              9 POP_TOP
             10 LOAD_CONST               0 (None)
             13 RETURN_VALUE
As you can see, after loading the global print and the constant 'a', the function gets called. Calling a function is therefore atomic, as it's done with a single instruction. Hence, in a multithreaded program either the function (print here) runs, or the running thread gets interrupted before the function gets the chance to run. That is, if a context switch occurs between LOAD_GLOBAL and LOAD_CONST, the instruction CALL_FUNCTION won't run.
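For reference, the switching interval mentioned earlier can be inspected at runtime. This is only a sketch using the standard sys module: Python 2.x exposes the instruction-count interval through sys.getcheckinterval(), while Python 3.2 and later use a time-based interval available through sys.getswitchinterval(). The helper name describe_switch_policy is made up for illustration.

```python
import sys

def describe_switch_policy():
    """Return (unit, value) for how often the interpreter considers a thread switch."""
    if hasattr(sys, "getswitchinterval"):
        # Python 3.2+: the interval is time-based and measured in seconds.
        return ("seconds", sys.getswitchinterval())
    # Python 2.x: the interval is counted in bytecode instructions.
    return ("instructions", sys.getcheckinterval())

unit, value = describe_switch_policy()
print("thread switch interval: %s %s" % (value, unit))
```

On Python 2.7 this takes the second branch and reports the instruction count; on a current Python 3 it reports a fraction of a second instead.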
Keep in mind that in the above code I'm using from __future__ import print_function, so I'm really calling a builtin function now, not the print statement. Let's take a look at the bytecode of function x, but this time with the print statement:
>>> def x(): print "a"  # print stmt
...
>>> dis.dis(x)
  1           0 LOAD_CONST               1 ('a')
              3 PRINT_ITEM
              4 PRINT_NEWLINE
              5 LOAD_CONST               0 (None)
              8 RETURN_VALUE
It's quite possible in this case that a thread context switch occurs between PRINT_ITEM and PRINT_NEWLINE, so another thread gets to print before PRINT_NEWLINE executes. So if you have a multithreaded program like this (borrowed from Programming Python, 4th edition, and slightly modified):
import thread
import time

def counter(myId, count):
    for i in range(count):
        time.sleep(1)
        print ('[%s] => %s' % (myId, i))  # print (stmt) 2.X

for i in range(5):
    thread.start_new_thread(counter, (i, 5))

time.sleep(6)  # don't quit early so other threads don't die
The output may or may not look like this depending on how threads were switched:
[0] => 0
[3] => 0[1] => 0
[4] => 0
[2] => 0
...many more...
This is all okay with the print statement.
What happens if we replace the print statement with the builtin print function? Let's see:
from __future__ import print_function
import thread
import time

def counter(myId, count):
    for i in range(count):
        time.sleep(1)
        print('[%s] => %s' % (myId, i))  # print builtin (func)

for i in range(5):
    thread.start_new_thread(counter, (i, 5))

time.sleep(6)
If you run this script enough times, you'll see something like this:
[4] => 0
[3] => 0[1] => 0
[2] => 0
[0] => 0
...many more...
Given all the above explanation, how can this be? print is a function now, so how come it prints the passed-in string but not the newline? print writes the value of end after the printed string, and end defaults to \n. If a function call is atomic, how on earth did it get interrupted?
Let's blow our minds:
from __future__ import print_function
import thread
import time

def counter(myId, count):
    for i in range(count):
        time.sleep(1)
        # sys.stdout.write('[%s] => %s\n' % (myId, i))
        print('[%s] => %s\n' % (myId, i), end='')

for i in range(5):
    thread.start_new_thread(counter, (i, 5))

time.sleep(6)
Now the newline is always printed and the output is no longer jumbled:
[1] => 0
[2] => 0
[0] => 0
[4] => 0
...many more...
Adding \n to the string obviously proves that the print function is not atomic (even though it's a function); it essentially acts just like the print statement. dis.dis, however, misleadingly suggests that it's a simple function call and thus an atomic operation?!
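One way to look at the difference between the two variants, as a rough sketch: send the output to a recording object instead of the terminal and inspect the individual write calls. RecordingStream and run_workers are hypothetical names, and the sketch uses the threading module instead of thread so that it also runs on Python 3. It only records what print hands to the stream; it does not reproduce the interleaving itself.

```python
from __future__ import print_function
import threading

class RecordingStream(object):
    """Hypothetical file-like object that records every write() call."""
    def __init__(self):
        self.writes = []

    def write(self, text):
        # list.append is safe to call from several threads in CPython
        self.writes.append(text)

def run_workers(stream, newline_in_string):
    def counter(my_id, count):
        for i in range(count):
            if newline_in_string:
                print('[%s] => %s\n' % (my_id, i), end='', file=stream)
            else:
                print('[%s] => %s' % (my_id, i), file=stream)

    threads = [threading.Thread(target=counter, args=(n, 5)) for n in range(5)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

separate = RecordingStream()
run_workers(separate, newline_in_string=False)

combined = RecordingStream()
run_workers(combined, newline_in_string=True)

# With the default end, each newline arrives as its own write() call.
print(separate.writes.count('\n'))
# With the newline inside the string, every non-empty write is a complete line.
print(all(w.endswith('\n') for w in combined.writes if w))
```

The non-empty filter is there because print may still pass the empty end string to write.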
Note: I never rely on the order or timing of threads for applications to work properly. This is for testing purposes only, and frankly for geeks like me.
Upvotes: 1
Views: 270
Reputation: 280564
Your question is based on the central premise
"Calling a function therefore is atomic as it's done with a single instruction."
which is thoroughly wrong.
First, executing the CALL_FUNCTION opcode can involve executing additional bytecode. The most obvious case of this is when the executed function is written in Python, but even built-in functions can freely call other code that may be written in Python. For example, print calls __str__ and write methods.
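A minimal sketch of that last point, with made-up class names (Message and Stream are not from the question): a single print call ends up running Python-level __str__ and write methods, and the newline arrives in a write call of its own.

```python
from __future__ import print_function

calls = []

class Message(object):
    """Hypothetical object whose __str__ is written in Python."""
    def __str__(self):
        calls.append('__str__')
        return 'a'

class Stream(object):
    """Hypothetical file-like object whose write() is written in Python."""
    def write(self, text):
        calls.append('write(%r)' % text)

# One call as far as the caller's bytecode is concerned, several Python calls underneath.
print(Message(), file=Stream())

for entry in calls:
    print(entry)
```

Each of those Python-level calls executes its own bytecode, so the interpreter has several opportunities to switch threads between writing the text and writing the newline.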
Second, Python is free to release the GIL even in the middle of C code. It commonly does this for I/O and other operations that might take a while without needing to perform Python API calls. There are 23 uses of the FILE_BEGIN_ALLOW_THREADS and Py_BEGIN_ALLOW_THREADS macros in the Python 2.7 file object implementation alone, including one in the implementation of file.write, which print relies on.
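As a rough illustration of the GIL being released inside a single C-level call, the sketch below uses os.read on a pipe rather than file.write, simply because a blocking read is easier to observe. The reader thread blocks inside os.read, and the main thread can still run and write to the pipe, which would be impossible if the blocked call held the GIL. The variable names are assumptions for this example only.

```python
import os
import threading
import time

read_end, write_end = os.pipe()
received = []

def reader():
    # A single function call in the bytecode, but it blocks inside C code.
    received.append(os.read(read_end, 1))

t = threading.Thread(target=reader)
t.start()

time.sleep(0.1)             # give the reader time to block in os.read
os.write(write_end, b'x')   # the main thread keeps running while the reader is blocked
t.join(5)

os.close(read_end)
os.close(write_end)
print(received)
```

If the reader kept the GIL while waiting, the main thread could never reach os.write and the program would hang instead of finishing.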
Upvotes: 2