Reputation: 703
I use Python's threading.Thread to spawn threads that run a small utility on every filename found by os.walk() and collect its output. I tried limiting the number of threads using:
ThreadLimiter = threading.BoundedSemaphore(3)
and
ThreadLimiter.acquire()
at the start of the run method, and
ThreadLimiter.release()
at the end of the run method.
But I still get the error message below when I run the Python program. Any suggestions on how to fix this?
bash: fork: retry: Resource temporarily unavailable
bash: fork: retry: Resource temporarily unavailable
Upvotes: 0
Views: 1129
Reputation: 77397
Use a thread pool and save yourself a lot of work! Here I md5sum files:
import os
import multiprocessing.pool
import subprocess as subp

def walker(path):
    """Walk the file system returning file names"""
    for dirpath, dirs, files in os.walk(path):
        for fn in files:
            yield os.path.join(dirpath, fn)

def worker(filename):
    """get md5 sum of file"""
    p = subp.Popen(['md5sum', filename], stdin=subp.PIPE,
                   stdout=subp.PIPE, stderr=subp.PIPE)
    out, err = p.communicate()
    return filename, p.returncode, out, err

pool = multiprocessing.pool.ThreadPool(3)
for filename, returncode, out, err in pool.imap(worker, walker('.'), chunksize=1):
    print(filename, out.strip())
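If you would rather keep your own threads, a likely reason the semaphore did not help (assuming every thread is started up front) is that acquire() inside run only blocks threads that already exist. One thread per file is still created, and that can exhaust the per-user process/thread limit, so fork fails. Below is a rough sketch that acquires the semaphore before creating each thread instead. The names process_tree, run_utility and MAX_THREADS are made up for illustration, and command stands in for whatever utility you run.

```python
import os
import subprocess
import threading

MAX_THREADS = 3  # assumed limit, same value as in the question


def run_utility(filename, command, limiter, results, lock):
    """Run command on one file, store its output, then free the slot."""
    try:
        proc = subprocess.run(command + [filename], capture_output=True)
        with lock:
            results.append((filename, proc.returncode, proc.stdout))
    finally:
        # Release even if the utility could not be launched.
        limiter.release()


def process_tree(path, command):
    """Run command on every file under path, at most MAX_THREADS at a time."""
    limiter = threading.BoundedSemaphore(MAX_THREADS)
    lock = threading.Lock()
    results = []
    threads = []
    for dirpath, _dirs, files in os.walk(path):
        for fn in files:
            # Block here, before the thread exists, so no more than
            # MAX_THREADS worker threads are ever alive at once.
            limiter.acquire()
            t = threading.Thread(
                target=run_utility,
                args=(os.path.join(dirpath, fn), command, limiter, results, lock),
            )
            t.start()
            threads.append(t)
    for t in threads:
        t.join()
    return results
```

Calling process_tree('.', ['md5sum']) would return a list of (filename, returncode, output) tuples in completion order. The pool version above is still less code, since it handles the bookkeeping for you.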
Upvotes: 1