roger

Reputation: 9893

Why does multiprocessing spawn multiple threads in each process?

I am using Python's multiprocessing module. Here is a simple example:

from multiprocessing import Pool
import time
import signal

def process(_id):
    time.sleep(2)
    return _id

def init_worker():
    signal.signal(signal.SIGINT, signal.SIG_IGN)

def main():
    pool = Pool(1, init_worker)
    for res in pool.imap(process, range(1000)):
        print(res)

if __name__ == "__main__":
    main()

This runs fine, but the thread listing confuses me:

# ps -eLaf | grep test_multi
cuidehe   4119  4118  4119  2    4 11:06 pts/25   00:00:00 python test_multi.py
cuidehe   4119  4118  4121  0    4 11:06 pts/25   00:00:00 python test_multi.py
cuidehe   4119  4118  4122  0    4 11:06 pts/25   00:00:00 python test_multi.py
cuidehe   4119  4118  4123  0    4 11:06 pts/25   00:00:00 python test_multi.py
cuidehe   4120  4119  4120  0    1 11:06 pts/25   00:00:00 python test_multi.py

As you can see, I forked only one process, with PID 4120, so I assume PID 4119 is the main process. But why does it have 4 threads?

One thing to point out is that it is not always 4 threads. For example:

pool = Pool(1, init_worker)
cursor = parse_db["jd_raw"].find({"isExpired": 0},
        {"jdJob.jobPosition": 1, "jdJob.jobDesc": 1, "jdFrom": 1}, no_cursor_timeout=True).\
                batch_size(15)

for res in pool.imap(process, cursor):
    pass

This time there are 6:

cuidehe   4522  2655  4522 21    6 11:28 pts/25   00:00:00 python test_multi_mongo.py
cuidehe   4522  2655  4525  0    6 11:28 pts/25   00:00:00 python test_multi_mongo.py
cuidehe   4522  2655  4527  0    6 11:28 pts/25   00:00:00 python test_multi_mongo.py
cuidehe   4522  2655  4528 54    6 11:28 pts/25   00:00:01 python test_multi_mongo.py
cuidehe   4522  2655  4529 46    6 11:28 pts/25   00:00:00 python test_multi_mongo.py
cuidehe   4522  2655  4530  0    6 11:28 pts/25   00:00:00 python test_multi_mongo.py
cuidehe   4526  4522  4526 28    1 11:28 pts/25   00:00:00 python test_multi_mongo.py

Also, it seems that not only the main process but also the child process spawns threads. Why does multiprocessing still need to spawn threads?

Upvotes: 0

Views: 329

Answers (1)

Roland Smith

Reputation: 43495

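The multiprocessing module starts three background threads in the parent process to manage a Pool (one maintains the worker processes, one feeds tasks to them, and one collects results), so your main program can continue in the meantime. Together with the main thread, that accounts for the 4 threads shown for PID 4119. See multiprocessing/pool.py in your Python installation.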
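As a rough way to check this yourself, the sketch below counts the threads in the parent process before and after creating a Pool. It assumes Python 3 on a Unix-like system (it requests the "fork" start method explicitly), and count_pool_threads is a hypothetical helper name for illustration, not part of multiprocessing.

```python
import multiprocessing
import threading


def count_pool_threads(processes=1):
    """Count parent-process threads before and after creating a Pool.

    Hypothetical helper for illustration; assumes the "fork" start
    method is available (Unix-like systems).
    """
    ctx = multiprocessing.get_context("fork")
    before = threading.active_count()
    pool = ctx.Pool(processes)
    try:
        # The Pool's management threads are already running at this point.
        after = threading.active_count()
        names = sorted(t.name for t in threading.enumerate())
    finally:
        # Shut the pool down so its threads and workers do not linger.
        pool.terminate()
        pool.join()
    return before, after, names


if __name__ == "__main__":
    before, after, names = count_pool_threads()
    print("threads before Pool:", before)
    print("threads after Pool: ", after)
    print("thread names:", names)
```

On CPython the difference between the two counts should correspond to the Pool's management threads. Any threads beyond that, as in the MongoDB example in the question, would presumably come from other libraries loaded in the same process.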

Upvotes: 2
