Reputation: 2923
I have a simple sample program running in Python 3.8 that opens a subprocess that executes under Python 2.7 (using multiprocessing).
On Windows 10, the code behaves as I intend: the Python 2 pool runs and prints to stdout, and main.py reads the stdout almost instantaneously, as the pool writes to it.
Unfortunately, I am seeing different results on Linux (Ubuntu 20.04.1 LTS). There, I don't get anything back until the whole pool has completed.
How can I make the code behave the same way on Linux?
Please see the sample code and the output I am getting below.
Main.py
import subprocess
import datetime
import tempfile
import os


def get_time():
    return datetime.datetime.now()


class ProcReader():
    def __init__(self, python_file, temp=None, wait=False):
        self.proc = subprocess.Popen(['python2', python_file], stdout=subprocess.PIPE)

    def __iter__(self):
        return self

    def __next__(self):
        while True:
            line = self.proc.stdout.readline()
            if not line:
                raise StopIteration
            return line


if __name__ == "__main__":
    r1 = ProcReader("p2.py")
    for l1 in r1:
        print("Main reading at: {} for {}".format(get_time(), l1))
p2.py
import time
import multiprocessing as mp
from multiprocessing import freeze_support
import datetime


def get_time():
    return datetime.datetime.now()


def f1(name):
    for x in range(2):
        time.sleep(1)
        print("{} Job#: {} from f1".format(get_time(), name))


def f2(name):
    for x in range(2):
        time.sleep(2)
        print("{} Job#: {} from f2".format(get_time(), name))


if __name__ == '__main__':
    freeze_support()
    pool = mp.Pool(2)
    tasks = ["1", "2", "3", "4", "5", "6", "7"]
    for i, task in enumerate(tasks):
        if i%2:
            pool.apply_async(f2, args=(task,))
        else:
            pool.apply_async(f1, args=(task,))
    pool.close()
    pool.join()
Output for Windows:
Main reading at: 2020-09-24 15:28:19.044626 for b'2020-09-24 15:28:19.044000 Job#: 1 from f1\n'
Main reading at: 2020-09-24 15:28:20.045454 for b'2020-09-24 15:28:20.045000 Job#: 1 from f1\n'
Main reading at: 2020-09-24 15:28:20.046711 for b'2020-09-24 15:28:20.046000 Job#: 2 from f2\n'
Main reading at: 2020-09-24 15:28:21.045510 for b'2020-09-24 15:28:21.045000 Job#: 3 from f1\n'
Main reading at: 2020-09-24 15:28:22.046334 for b'2020-09-24 15:28:22.046000 Job#: 3 from f1\n'
Main reading at: 2020-09-24 15:28:22.047368 for b'2020-09-24 15:28:22.047000 Job#: 2 from f2\n'
Main reading at: 2020-09-24 15:28:23.047519 for b'2020-09-24 15:28:23.047000 Job#: 5 from f1\n'
Main reading at: 2020-09-24 15:28:24.046356 for b'2020-09-24 15:28:24.046000 Job#: 4 from f2\n'
Main reading at: 2020-09-24 15:28:24.048356 for b'2020-09-24 15:28:24.048000 Job#: 5 from f1\n'
Main reading at: 2020-09-24 15:28:26.047307 for b'2020-09-24 15:28:26.047000 Job#: 4 from f2\n'
Main reading at: 2020-09-24 15:28:26.049168 for b'2020-09-24 15:28:26.049000 Job#: 6 from f2\n'
Main reading at: 2020-09-24 15:28:27.047897 for b'2020-09-24 15:28:27.047000 Job#: 7 from f1\n'
Main reading at: 2020-09-24 15:28:28.048337 for b'2020-09-24 15:28:28.048000 Job#: 7 from f1\n'
Main reading at: 2020-09-24 15:28:28.049367 for b'2020-09-24 15:28:28.049000 Job#: 6 from f2\n'
Output for Linux:
Main reading at: 2020-09-24 19:28:45.972346 for b'2020-09-24 19:28:36.932473 Job#: 1 from f1\n'
Main reading at: 2020-09-24 19:28:45.972559 for b'2020-09-24 19:28:37.933594 Job#: 1 from f1\n'
Main reading at: 2020-09-24 19:28:45.972585 for b'2020-09-24 19:28:38.935255 Job#: 3 from f1\n'
Main reading at: 2020-09-24 19:28:45.972597 for b'2020-09-24 19:28:39.936297 Job#: 3 from f1\n'
Main reading at: 2020-09-24 19:28:45.972685 for b'2020-09-24 19:28:40.937666 Job#: 5 from f1\n'
Main reading at: 2020-09-24 19:28:45.972711 for b'2020-09-24 19:28:41.938629 Job#: 5 from f1\n'
Main reading at: 2020-09-24 19:28:45.972724 for b'2020-09-24 19:28:43.941109 Job#: 6 from f2\n'
Main reading at: 2020-09-24 19:28:45.972735 for b'2020-09-24 19:28:45.943310 Job#: 6 from f2\n'
Main reading at: 2020-09-24 19:28:45.973115 for b'2020-09-24 19:28:37.933317 Job#: 2 from f2\n'
Main reading at: 2020-09-24 19:28:45.973139 for b'2020-09-24 19:28:39.935938 Job#: 2 from f2\n'
Main reading at: 2020-09-24 19:28:45.973149 for b'2020-09-24 19:28:41.938587 Job#: 4 from f2\n'
Main reading at: 2020-09-24 19:28:45.973157 for b'2020-09-24 19:28:43.941109 Job#: 4 from f2\n'
Main reading at: 2020-09-24 19:28:45.973165 for b'2020-09-24 19:28:44.942306 Job#: 7 from f1\n'
Main reading at: 2020-09-24 19:28:45.973173 for b'2020-09-24 19:28:45.943503 Job#: 7 from f1\n'
Please disregard the absolute times, since the clocks are different. As you can see, on Windows main.py receives each line as soon as it is written by the Python 2 pool, but on Linux everything reaches main.py only after all jobs have completed. I am not too concerned about the order in which the jobs complete; I just want main.py to receive the stdout as soon as it is written by the Python 2 pool.
Upvotes: 1
Views: 46
Reputation: 3514
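On Linux, stdout is block-buffered when it is connected to a pipe rather than a terminal, so the print() output from the multiprocessing workers is not flushed as it is written.
Monkey-patching sys.stdout in p2.py is useful here (a buffer size of 0 in text mode works in Python 2, which is what p2.py runs under; Python 3 rejects it):
import sys, os
unbuffered = os.fdopen(sys.stdout.fileno(), 'w', 0)
sys.stdout = unbuffered
Alternatively, call sys.stdout.flush() after every print().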
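To see the effect of flushing after every print(), here is a minimal sketch. It is not the asker's p2.py: CHILD_CODE and read_with_timestamps are hypothetical names made up for the demo, the child is a plain script without a multiprocessing pool, and it is launched with sys.executable instead of python2 so the sketch runs with whatever interpreter is at hand. The parent records when each line arrives on the pipe.

```python
import subprocess
import sys
import time

# Hypothetical stand-in for p2.py, inlined as a string for the demo.
# It prints two lines half a second apart and flushes after each print.
CHILD_CODE = r'''
import sys, time
for i in range(2):
    print("line {}".format(i))
    sys.stdout.flush()  # push the line into the pipe right away
    time.sleep(0.5)
'''


def read_with_timestamps(code):
    """Run `code` in a child interpreter and return a list of
    (seconds_since_start, raw_line) tuples, one per line read."""
    proc = subprocess.Popen([sys.executable, "-c", code],
                            stdout=subprocess.PIPE)
    start = time.time()
    arrivals = []
    # readline() returns b"" at end of file, which ends the loop.
    for line in iter(proc.stdout.readline, b""):
        arrivals.append((time.time() - start, line))
    proc.stdout.close()
    proc.wait()
    return arrivals


if __name__ == "__main__":
    for elapsed, line in read_with_timestamps(CHILD_CODE):
        print("Main reading at +{:.2f}s for {!r}".format(elapsed, line))
```

With the flush in place, the second line should arrive roughly half a second after the first instead of both arriving together when the child exits. If editing the child script is inconvenient, launching it with the interpreter's -u option (for example ['python2', '-u', python_file] in Main.py) or setting the PYTHONUNBUFFERED environment variable may achieve the same effect; I have not verified either against the pool in the question.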
Upvotes: 1