Anand

Reputation: 2359

Processing multiple files on multiple cores not working

I have several files under C:/workspace on my machine, and each one needs some post-processing. I tried to use multiprocessing for this, but my target function never seems to be called. Here is the code:

import glob
from multiprocessing import Process

def execute_all(f):
    print("Working on file: ", f)
    #more code


if __name__ == '__main__':
    files = []
    for file in glob.iglob('C:/workspace/**/*.c', recursive=True):
        files.append(file)

    ListOfProcesses = []
    Processors = 5 
    Parts = [files[i:i + Processors] for i in range(0, len(files), Processors)]
    print("Parts" , Parts)

    for part in Parts:
        for f in part:
            print("File ", f)
            p = Process(target=execute_all, args=(f))
            print("Process called")
            p.start()
            print("Process started")
            ListOfProcesses.append(p)
        for p in ListOfProcesses:
            print(p)
            p.join()    

I deliberately added several print statements in order to check whether the processes are being started. Here is the output:

Parts [['C:/workspace/test\\a.c', 'C:/workspace/test\\b.c', 'C:/workspace/test\\c.c']]
File  C:/workspace/test\a.c
Process called
Process started
File  C:/workspace/test\b.c
Process called
Process started
File  C:/workspace/test\c.c
Process called
Process started
<Process name='Process-49' pid=23188 parent=16640 started>
<Process name='Process-50' pid=31464 parent=16640 started>
<Process name='Process-51' pid=17016 parent=16640 stopped exitcode=1>

So the processes are being started, but my target function, execute_all, is not being called. Is there anything I am missing here?

Upvotes: 0

Views: 75

Answers (1)

Henry Harutyunyan

Reputation: 2425

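One issue that caught my eye is how you pass args to Process. It needs to be a tuple, even when there is only a single argument. (f) is not a tuple: the parentheses only group the expression, so it is just the string f. Process unpacks args when it calls the target, so the string is spread into its individual characters, execute_all receives one argument per character, and the child fails with a TypeError. That would match the exitcode=1 in your output. To make a one-element tuple, add a trailing comma: (f,)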
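To see the difference without starting any processes, here is a minimal sketch. The path "a.c" and the variable names are only illustrative, and execute_all returns its message here instead of printing it so the result is easy to inspect. The unpacking with * mirrors how Process calls target(*args).

```python
def execute_all(f):
    # Stand-in for the real target; returns instead of printing.
    return "Working on file: " + f


path = "a.c"              # illustrative file name
not_a_tuple = (path)      # parentheses only group: this is still a str
one_tuple = (path,)       # the trailing comma makes a 1-element tuple

print(type(not_a_tuple).__name__)
print(type(one_tuple).__name__)

# Process effectively runs target(*args), so unpack the same way here.
print(execute_all(*one_tuple))

try:
    # Unpacking the string passes 'a', '.', 'c' as three separate arguments.
    execute_all(*not_a_tuple)
except TypeError as exc:
    print("TypeError:", exc)
```

In a child process the same TypeError is raised inside the worker, which is why the parent only sees a non-zero exit code.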


With that change, the following code works for me:

from multiprocessing import Process


def execute_all(f):
    print("Working on file: ", f)
    # more code


if __name__ == '__main__':
    files = []

    for file in ['test1', 'test2', 'test3', 'test4', 'test5']:
        files.append(file)

    ListOfProcesses = []
    Processors = 5
    Parts = [files[i:i + Processors] for i in range(0, len(files), Processors)]
    print("Parts", Parts)

    for part in Parts:
        for f in part:
            print("File ", f)
            p = Process(target=execute_all, args=(f,))
            print("Process called")
            p.start()
            print("Process started")
            ListOfProcesses.append(p)
        for p in ListOfProcesses:
            print(p)
            p.join()

The output is as follows:

Parts [['test1', 'test2', 'test3', 'test4', 'test5']]
File  test1
Process called
Process started
File  test2
Process called
Process started
File  test3
Process called
Process started
File  test4
Process called
Process started
File  test5
Process called
Process started
<Process name='Process-1' pid=14839 parent=14837 started>
Working on file:  test1
Working on file:  test2
Working on file:  test3
Working on file:  test5
Working on file:  test4
<Process name='Process-2' pid=14840 parent=14837 started>
<Process name='Process-3' pid=14841 parent=14837 stopped exitcode=0>
<Process name='Process-4' pid=14842 parent=14837 started>
<Process name='Process-5' pid=14843 parent=14837 stopped exitcode=0>
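If the child's traceback does not show up in your console, checking exitcode after join() is one way to notice that a worker failed. The following is only a sketch: failed_processes is a hypothetical helper name, and the second process is started with the string form of args on purpose so that it fails.

```python
from multiprocessing import Process


def execute_all(f):
    print("Working on file: ", f)


def failed_processes(processes):
    """Return the names of finished processes with a non-zero exit code."""
    # exitcode is None while a process has not finished yet.
    return [p.name for p in processes if p.exitcode not in (None, 0)]


def main():
    good = Process(target=execute_all, args=("a.c",))
    bad = Process(target=execute_all, args=("a.c"))  # a str, not a tuple
    workers = [good, bad]
    for p in workers:
        p.start()
    for p in workers:
        p.join()
    # The worker started with the string form of args is reported here.
    print("Failed:", failed_processes(workers))


if __name__ == "__main__":
    main()
```

An uncaught exception in the target makes the child exit with code 1, which is what the stopped exitcode=1 line in the question shows.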

Upvotes: 2
