Guy Ohayon
Guy Ohayon

Reputation: 45

Segmentation fault (core dumped) when trying to use subprocess in python

I am trying to run a program many times in parallel (with different arguments). I searched online and found that subprocess in python is a good way to do that. My code is the following:

import subprocess
import os

models_path="~/"
procs = []
num_of_procs_running=0
i = 0
for model_name in os.listdir(models_path):
    if num_of_procs_running > num_of_procs:
        for p in procs:
            p.wait()
        procs = []
        num_of_procs_running = 0
    elif model_name.endswith(".onnx"):
        name, ending = model_name.split(".")
        runner = './SOME_PROGRAM '+str(i)+'> output.txt'
        i+=1
        procs.append(subprocess.Popen(runner,shell=True))
        num_of_procs_running += 1
        print("Total processes:",num_of_procs_running)
        print("\n")
for p in procs:
    p.wait()

I am getting Segmentation Fault(core dumped) if I am trying to run more than 56 subprocesses. My machine has 56 CPU's and and each CPU has 12 cores. How can I run more than 56 subprocesses, or maybe use threads? Thanks.

Upvotes: 1

Views: 3375

Answers (1)

asynts
asynts

Reputation: 2423

  1. As @danielpryden already suggested don't invoke a shell:

    import subprocess
    
    # bad (mocking your code)
    p = subprocess.Popen("echo 1 > output.txt", shell=True)
    p.wait()
    
    # better
    fp = open("output.txt", "w")
    p = subprocess.Popen(["echo", "1"], stdout=fp)
    
    p.wait() # or do something else
    fp.close() # remember to close file
    
  2. When reaching the maximum number of threads, it waits until all threads have finished, quite inefficient. What you seemingly try to achieve is exactly what multiprocessing.pool.Pool does:

    from multiprocessing.pool import Pool
    
    # mocking your code
    modules = ["mod1.onnx", "mod2.onnx"]
    
    def run(module):
      print("running", module)
    
    # use the maximum number of threads by default
    with Pool() as pool:
      pool.map(run, modules)
    
  3. The segmentation fault isn't caused by your Python application, but rather from whatever program you try to execute. If it's C++ have a look at std::thread::hardware_concurrency; You need to change the program code not your python script.

Upvotes: 2

Related Questions