Why am I able to run more concurrent threads than I have CPUs?

Question

My CPU has 4 cores, so as I understand threads in Python, I can only run 4 threads concurrently. To test this, I wrote:

import threading
from time import sleep

def print_after_sleep(msg):
    sleep(5)
    print(msg)

one = threading.Thread(target=print_after_sleep, args=[1])
two = threading.Thread(target=print_after_sleep, args=[2])
three = threading.Thread(target=print_after_sleep, args=[3])
four = threading.Thread(target=print_after_sleep, args=[4])
five = threading.Thread(target=print_after_sleep, args=[5])
six = threading.Thread(target=print_after_sleep, args=[6])
one.start()
sleep(0.2)
two.start()
sleep(0.2)
three.start()
sleep(0.2)
four.start()
sleep(0.2)
five.start()
sleep(0.2)
six.start()

I expected threads 1-4 to print their messages in rapid succession, followed by a wait of nearly 5 seconds before threads 5-6 printed theirs. What I actually got was all six threads printing their messages in rapid succession.

Susmit Agrawal · Accepted Answer

Let's talk about threads first.

There are almost always more threads in a modern-day application than there are cores in the machine's CPU. A thread is not permanently tied to one core: the operating system's scheduler decides which thread runs on which core at any given moment, and each core executes runnable threads in a time-shared fashion.
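As a quick check that the thread count is not capped by the core count, here is a minimal sketch. The helper name count_live_threads is my own, not something from the threading module, and the factor of 8 is arbitrary. Each thread simply blocks on an Event until it is released.

```python
import os
import threading

def count_live_threads(n_threads):
    """Start n_threads threads that block on an Event; return how many are alive at once."""
    release = threading.Event()
    threads = [threading.Thread(target=release.wait) for _ in range(n_threads)]
    for t in threads:
        t.start()
    # Every thread is parked in release.wait(), so all of them are still alive here.
    alive = sum(t.is_alive() for t in threads)
    release.set()  # let all threads finish
    for t in threads:
        t.join()
    return alive

cores = os.cpu_count() or 1  # cpu_count() may return None
print(f"{cores} logical CPUs, {count_live_threads(cores * 8)} threads alive at once")
```

The threads all exist at the same time; at any instant only some of them would actually be running if they had work to do.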

Your misconception concerns how long the scheduler takes to switch between threads. A switch typically takes on the order of microseconds (1e-6 seconds). If switching threads took 5-6 seconds, most software today would be useless.

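Coming to specifics about Python: the standard implementation, CPython, is restricted by a Global Interpreter Lock, or GIL. In short, only one thread can execute Python bytecode at a time, so CPU-bound threads effectively share a single core. The GIL is released while a thread is blocked, for example inside sleep, so it does not limit how many threads can be sleeping at once.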
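To see that sleeping threads do not hold each other up, you can time a batch of them. This is only a sketch: run_sleepers is a name I made up, and the 0.5 second sleep is scaled down from your 5 seconds to keep the run short.

```python
import threading
import time

def run_sleepers(n_threads, seconds):
    """Run n_threads threads that each sleep for `seconds`; return the total wall time."""
    threads = [
        threading.Thread(target=time.sleep, args=(seconds,))
        for _ in range(n_threads)
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start

elapsed = run_sleepers(6, 0.5)
print(f"6 threads x 0.5 s sleep took {elapsed:.2f} s in total")
```

If the sleeps ran one after another the total would be about 3 seconds. On a typical machine it should come out close to 0.5 seconds, because all six threads wait at the same time.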

EDIT

Threads have different states that help the OS scheduler optimize its operations.

I won't go into the details of the thread-state diagram. All you need to know is the following:

Multiple threads can be in the 'ready' state on a single core. This essentially means that they can be executed as soon as the scheduler allows them to.
Only one thread per core can be 'running'. All other threads will be 'ready' or 'blocked' (aka 'waiting').
When you call sleep, the thread moves into the 'blocked' state and gives up its core. The moment a core has no 'running' thread, the scheduler will immediately assign a 'ready' thread to it. In your case, this is a thread that has finished waiting for 5 seconds.

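In your program, each thread is 'blocked' for its 5 seconds and uses no CPU during that time. The cores are probably not even running your program in the meantime. Each thread then moves to the 'ready' state when its sleep expires, roughly 0.2 seconds after the previous one since that is how far apart you started them, and the scheduler runs it almost immediately. That is why the messages appear in rapid succession.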
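A scaled-down version of your program can record when each thread wakes up, which makes the timing visible. Treat this as a sketch: staggered_wake_times and its parameters are my own names, and the 0.5 second sleep and 0.1 second stagger stand in for your 5 seconds and 0.2 seconds.

```python
import threading
import time

def staggered_wake_times(n_threads=6, nap=0.5, stagger=0.1):
    """Start threads `stagger` seconds apart; return when each one woke up, in seconds."""
    t0 = time.perf_counter()
    wake_times = [None] * n_threads

    def worker(i):
        time.sleep(nap)  # thread is blocked here and uses no CPU
        wake_times[i] = time.perf_counter() - t0

    threads = []
    for i in range(n_threads):
        t = threading.Thread(target=worker, args=(i,))
        t.start()
        threads.append(t)
        if i < n_threads - 1:
            time.sleep(stagger)  # same pattern as the sleep(0.2) calls in the question
    for t in threads:
        t.join()
    return wake_times

for i, woke in enumerate(staggered_wake_times(), start=1):
    print(f"thread {i} woke up {woke:.2f} s after the program started")
```

Each wake-up time should be roughly the sleep length plus that thread's start offset. No thread waits for a free core, regardless of how many cores the machine has.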
