Is the `print` builtin function in Python 2.X atomic?

Question

I've been exploring the internal implementation of threads in Python this week. Every day I'm amazed by how much I didn't know; not knowing what I want to know is what makes me itch.

I noticed something strange in a piece of code that I ran under Python 2.7 as a multithreaded application. We all know that Python 2.7 switches between threads after 100 virtual instructions by default. Calling a function is one virtual instruction, for example:

>>> from __future__ import print_function
>>> import dis
>>> def x(): print('a')
... 
>>> dis.dis(x)
  1           0 LOAD_GLOBAL              0 (print)
              3 LOAD_CONST               1 ('a')
              6 CALL_FUNCTION            1
              9 POP_TOP             
             10 LOAD_CONST               0 (None)
             13 RETURN_VALUE

As you can see, after loading the global print and the constant 'a', the function gets called. Calling a function therefore is atomic as it's done with a single instruction. Hence, in a multithreaded program, either the function (print here) runs, or the running thread gets interrupted before the function gets the chance to run. That is, if a context switch occurs between LOAD_GLOBAL and LOAD_CONST, the CALL_FUNCTION instruction won't run.

Keep in mind that the above code uses from __future__ import print_function, so I'm really calling a builtin function now, not the print statement. Let's take a look at the bytecode of function x, but this time with the print statement:

>>> def x(): print "a"          # print stmt
... 
>>> dis.dis(x)
  1           0 LOAD_CONST               1 ('a')
              3 PRINT_ITEM          
              4 PRINT_NEWLINE       
              5 LOAD_CONST               0 (None)
              8 RETURN_VALUE

In this case it's quite possible that a thread context switch occurs between PRINT_ITEM and PRINT_NEWLINE, delaying the newline until the thread runs again. So if you have a multithreaded program like this (borrowed from Programming Python, 4th edition, and slightly modified):

import thread
import time

def counter(myId, count):
    for i in range(count):
        time.sleep(1)
        print ('[%s] => %s' % (myId, i)) #print (stmt) 2.X 

for i in range(5):
    thread.start_new_thread(counter, (i, 5))

time.sleep(6)  # don't quit early so other threads don't die

The output may or may not look like this depending on how threads were switched:

[0] => 0
[3] => 0[1] => 0
[4] => 0
[2] => 0
...many more...

This is all okay with the print statement.

What happens if we replace the print statement with the builtin print function? Let's see:

from __future__ import print_function
import thread
import time

def counter(myId, count):
    for i in range(count):
        time.sleep(1)
        print('[%s] => %s' % (myId, i))  # print builtin (func)

for i in range(5):
    thread.start_new_thread(counter, (i, 5))

time.sleep(6)

If you run this script long enough and multiple times, you'll see something like this:

[4] => 0
[3] => 0[1] => 0
[2] => 0
[0] => 0
...many more...

Given all the above explanation, how can this be? print is a function now, so how come it prints the passed-in string but not the newline? print writes the value of end after the printed string, and end defaults to '\n'. Essentially, a call to a function is atomic, so how on earth did it get interrupted?

Let's blow our minds:

def counter(myId, count):
    for i in range(count):
        time.sleep(1)
        #sys.stdout.write('[%s] => %s\n' % (myId, i))
        print('[%s] => %s\n' % (myId, i), end='')

for i in range(5):
    thread.start_new_thread(counter, (i, 5))

time.sleep(6)

Now the newline is always printed and the output is no longer jumbled:

[1] => 0
[2] => 0
[0] => 0
[4] => 0
...many more...

Adding \n to the string itself shows that the print function is not atomic (even though it's a function); it essentially behaves just like the print statement. dis.dis, however, seems to tell us, incoherently, that it's a simple function call and thus an atomic operation?!

Note: I never rely on the order or timing of threads for applications to work properly. This is just for testing purposes only and frankly for geeks like me.

user2357112 · Accepted Answer

Your question is based on the central premise

Calling a function therefore is atomic as it's done with a single instruction.

which is thoroughly wrong.

First, executing the CALL_FUNCTION opcode can involve executing additional bytecode. The most obvious case of this is when the executed function is written in Python, but even built-in functions can freely call other code that may be written in Python. For example, print calls __str__ and write methods.
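To make that concrete, here is a minimal sketch (not taken from the question) that passes print a file-like object and an argument, both written in Python. `RecordingStream` and `Noisy` are hypothetical names invented for this illustration. The `__future__` import is only needed on Python 2; the snippet also runs on Python 3.

```python
from __future__ import print_function  # no-op on Python 3, needed on Python 2


class RecordingStream(object):
    """Hypothetical file-like object that logs every write() call."""

    def __init__(self):
        self.calls = []

    def write(self, text):
        # This is ordinary Python bytecode, run from inside print().
        self.calls.append(text)


class Noisy(object):
    """Hypothetical object whose __str__ is also ordinary Python bytecode."""

    def __init__(self):
        self.str_calls = 0

    def __str__(self):
        self.str_calls += 1
        return 'a'


stream = RecordingStream()
obj = Noisy()

# One call instruction in the caller's bytecode, yet print() invokes
# obj.__str__() once and stream.write() twice before it returns.
print(obj, file=stream)
```

On CPython, `stream.calls` ends up holding two separate entries, the text and then the value of `end`, so the string and its newline reach the stream through two distinct `write` calls. A thread switch can happen between them.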

Second, Python is free to release the GIL even in the middle of C code. It commonly does this for I/O and other operations that might take a while without needing to perform Python API calls. There are 23 uses of the FILE_BEGIN_ALLOW_THREADS and Py_BEGIN_ALLOW_THREADS macros in the Python 2.7 file object implementation alone, including one in the implementation of file.write, which print relies on.
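If you do want each line to come out intact, one common approach is to serialize whole print calls with a lock. The sketch below assumes that every thread goes through the same wrapper. `locked_print`, `_print_lock` and `run` are hypothetical names, and the code uses the `threading` module rather than the `thread` module from the question so that it can join the workers instead of sleeping.

```python
from __future__ import print_function  # no-op on Python 3, needed on Python 2

import sys
import threading

_print_lock = threading.Lock()  # hypothetical module-level lock shared by all threads


def locked_print(*args, **kwargs):
    """Hold the lock for the whole print() call, so text and newline stay together."""
    with _print_lock:
        print(*args, **kwargs)


def counter(my_id, count, out):
    for i in range(count):
        locked_print('[%s] => %s' % (my_id, i), file=out)


def run(out=sys.stdout, n_threads=5, count=5):
    """Start the workers and wait for all of them to finish."""
    threads = [threading.Thread(target=counter, args=(n, count, out))
               for n in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # join instead of time.sleep(6)


if __name__ == '__main__':
    run()
```

The lock does not make print itself atomic. It only prevents two calls to `locked_print` from interleaving, so any code that writes to the same stream without taking the lock can still garble the output. The order of the lines across threads also remains unspecified.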
