Reputation: 112
I am currently working on a program that uses an online OCR API. This API takes 2-5 seconds to send me a processed image, so instead of making the user wait for all images to be processed, the user can start working on the first image while the rest are processed on a different instance of Python using multiprocessing. I have been using multiprocessing.Pipe() to send values back and forth. The code is here:
import multiprocessing as mp
# importing cv2, PIL, os, json, other stuff

def image_processor():
    # processes the first image in the list, then moves the remaining images to a different python instance:
    p_conn, c_conn = mp.Pipe()
    p = mp.Process(target=Processing.worker, args=([c_conn, images, path], 5))
    p.start()
    while True:
        out = p_conn.recv()
        if not out:
            break
        else:
            im_data.append(out)
            p_conn.send(True)

class Processing:
    def worker(data, mode, headers=0):
        # (some if statements go here)
        elif mode == 5:
            print(data[0])
            for im_name in data[1]:
                if data[1].index(im_name) != 0:
                    im_path = f'{data[2]}\{im_name}' # find image path
                    im = pil_img.open(im_path).convert('L') # open and grayscale image with PIL
                    os.rename(im_path, f'{data[2]}\Archive\{im_name}') # move original to archive
                    im_grayscale = f'{data[2]}\g_{im_name}' # create grayscale image path
                    im.save(im_grayscale) # save grayscale image
                    ocr_data = json.loads(bl.Visual.OCR.ocr_space_file(im_grayscale)).get('ParsedResults')[0].get('ParsedText').splitlines()
                    print(ocr_data)
                    data[0].send([im_name, f'{data[2]}\Archive\{im_name}', ocr_data])
                    data[0].recv()
                data[0].send(False)
This leaves me with the following traceback:
Process Process-1:
Traceback (most recent call last):
  File "C:\Users\BruhK\AppData\Local\Programs\Python\Python310\lib\multiprocessing\process.py", line 315, in _bootstrap
    self.run()
  File "C:\Users\BruhK\AppData\Local\Programs\Python\Python310\lib\multiprocessing\process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "c:\Users\BruhK\PycharmProjects\pythonProject\FleetFeet-OCR-Final.py", line 275, in worker
    data[0].send([{im_name}, f'{data[2]}\Archive\{im_name}', ocr_data])
  File "C:\Users\BruhK\AppData\Local\Programs\Python\Python310\lib\multiprocessing\connection.py", line 211, in send
    self._send_bytes(_ForkingPickler.dumps(obj))
  File "C:\Users\BruhK\AppData\Local\Programs\Python\Python310\lib\multiprocessing\connection.py", line 285, in _send_bytes
    ov, err = _winapi.WriteFile(self._handle, buf, overlapped=True)
BrokenPipeError: [WinError 232] The pipe is being closed
Note that the data sent from the child function to the parent was a 2d or 3d array. In testing I've been able to send 2d and 3d arrays back and forth between child and parent functions.
An example of the code I used for testing is as follows:
import multiprocessing as mp
import random
import time

def hang(p):
    hang_time = random.randint(1, 5)
    time.sleep(hang_time)
    print(p)
    p.send(hang_time)
    time.sleep(1)

class Child:
    def process():
        start = time.time()
        p_conn, c_conn = mp.Pipe()
        p = mp.Process(target=hang, args=(c_conn,))
        p.start()
        out = p_conn.recv()
        print(f'Waited for {time.time() - start}')
        p.join()
        print(f'New time: {time.time() - start}')
        return out

class Parent:
    def run():
        # do some stuff
        print(f'Hang time: {Child.process()}')
        # do some stuff

if __name__ == '__main__':
    Parent.run()
How do I fix this issue? Is there any additional information needed?
Upvotes: 1
Views: 1852
Reputation: 1127
It looks like you have the wrong indentation: data[0].send(False) is inside the for loop, so it sends False after processing the first image and your main process exits the while True loop.
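A minimal sketch of the corrected layout, assuming the protocol from the question: the child sends one result, waits for an acknowledgement, and sends False only once, after the loop has finished. The names worker, collect, and main, and the fake OCR strings, are hypothetical stand-ins and not the asker's real code:

```python
import multiprocessing as mp


def worker(conn, names):
    # Hypothetical stand-in for the OCR worker: skip the first image,
    # send one result per remaining image, and wait for an ack each time.
    for name in names[1:]:
        conn.send([name, f"ocr text for {name}"])
        conn.recv()  # block until the parent acknowledges the result
    # The sentinel is sent once, AFTER the loop, so the parent keeps
    # listening until every image has been delivered.
    conn.send(False)
    conn.close()


def collect(conn):
    # Parent side: read results until the False sentinel arrives.
    results = []
    while True:
        out = conn.recv()
        if not out:
            break
        results.append(out)
        conn.send(True)  # acknowledge so the worker can continue
    return results


def main():
    names = ["a.png", "b.png", "c.png"]  # made-up file names
    p_conn, c_conn = mp.Pipe()
    p = mp.Process(target=worker, args=(c_conn, names))
    p.start()
    results = collect(p_conn)
    p.join()
    print(results)


if __name__ == "__main__":
    main()
```

With the sentinel inside the loop instead, collect() would return after the first iteration while the worker still had images left to send.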
Upvotes: 1
Reputation: 112
As @tturbo pointed out, the line data[0].send(False) was inside the for loop when it was supposed to be outside of it. Moving it out of the loop stopped the broken pipe error. I'm not sure why that fixed it; if anyone is willing to shed some light on it, be my guest. For me, what matters is that it worked. Thank you.
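A possible explanation, based only on the posted code and not verified against the full program: once the parent received False it left the while loop, image_processor() returned, and the parent's end of the pipe was presumably closed when p_conn went out of scope. The child was still looping, so its next send() hit a pipe with nobody on the other end. The sketch below reproduces that situation in a single process by closing one end by hand; the function name send_after_peer_closed is hypothetical:

```python
import multiprocessing as mp


def send_after_peer_closed():
    """Return the name of the exception raised when sending on a pipe
    whose other end has already been closed, or None if none is raised."""
    parent_conn, child_conn = mp.Pipe()
    parent_conn.close()  # simulates the parent dropping its end early
    try:
        child_conn.send(["image.png", "some text"])
    except OSError as exc:
        # BrokenPipeError is a subclass of OSError; catching the base
        # class keeps the sketch tolerant of platform differences.
        return type(exc).__name__
    finally:
        child_conn.close()
    return None


if __name__ == "__main__":
    print(send_after_peer_closed())
```

On Windows the same condition is reported as WinError 232, which matches the traceback in the question.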
Upvotes: 1