tomfriwel

Reputation: 2635

How to control memory usage in multithreading?

I am using multiple threads to process images.

It works fine on my computer, which has enough memory (usage increases by 2~3 GB when processing many images), but my server only has 1 GB of memory and the code does not work properly there.

It sometimes ends with a Segmentation fault, sometimes with:

Exception in thread Thread-13:
Traceback (most recent call last):
  File "/usr/lib/python2.7/threading.py", line 810, in __bootstrap_inner
    self.run()
  File "/usr/lib/python2.7/threading.py", line 763, in run
    self.__target(*self.__args, **self.__kwargs)
  File "passportRecognizeNew.py", line 267, in doSomething
  ...

Code:

import threading

def doSomething(image):
    # picture processing code
    print("processing over")

threads = []

for i in range(20):
    thread = threading.Thread(target=doSomething, args=("image",))
    threads.append(thread)

for t in threads:
    t.setDaemon(True)
    t.start()

t.join()

print("All over")

How to solve this or any way to control memory usage?

Upvotes: 3

Views: 4936

Answers (2)

tomfriwel

Reputation: 2635

With GhostCat's help, I used the following code to solve the memory usage problem.

import Queue
import threading
import multiprocessing
import time
import psutil


class ThreadSomething(threading.Thread):
    def __init__(self, queue):
        threading.Thread.__init__(self)
        self.queue = queue

    def run(self):
        while True:
            # check available memory
            virtualMemoryInfo = psutil.virtual_memory()
            availableMemory = virtualMemoryInfo.available

            print(str(availableMemory / 1024 / 1024) + "M")

            if availableMemory > MEMORY_WARNING:
                # image from queue
                image = self.queue.get()

                # do something
                doSomething(image)

                # signals to queue job is done
                self.queue.task_done()
            else:
                print("memory warning!")
                # avoid busy-waiting while memory is low
                time.sleep(1)

def doSomething(image):
    # picture processing code, cost time and memory
    print("processing over")

# After testing, there seems to be no point in creating more threads than
# CPU_COUNT; execution time is not reduced.
CPU_COUNT = multiprocessing.cpu_count()
MEMORY_WARNING = 200*1024*1024  # 200M

images = ["1.png", "2.png", "3.png", "4.png", "5.png"]
queue = Queue.Queue()

def main():
    # spawn a pool of threads, and pass them the queue instance
    for i in range(CPU_COUNT):
        t = ThreadSomething(queue)
        t.setDaemon(True)
        t.start()

    # populate the queue with data
    for image in images:
        queue.put(image)

    # wait on the queue until everything has been processed
    queue.join()

start = time.time()
main()
print('All over. Elapsed Time: %s' % (time.time() - start))

I use the psutil module to get the available memory.
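For reference, a minimal check of what `psutil.virtual_memory()` reports (the field names are from psutil's documented API; the 200M threshold is simply the value used in the code above):

```python
import psutil

mem = psutil.virtual_memory()
print("total:     %d MB" % (mem.total // (1024 * 1024)))
print("available: %d MB" % (mem.available // (1024 * 1024)))

MEMORY_WARNING = 200 * 1024 * 1024  # same 200M threshold as above
print("enough memory to start a job: %s" % (mem.available > MEMORY_WARNING))
```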

Reference code: yosemitebandit/ibm_queue.py

The code in my question has the problem of creating more threads than CPU_COUNT.

Upvotes: 5

GhostCat

Reputation: 140603

I think you are looking at this from the wrong angle. Your code fires up n threads. Those threads then execute work that you defined for them.

If that work requires them to allocate a lot of memory - what should anything "outside" of that context do about this? What should happen? Should some of the threads be killed? Should somewhere, deep down in C code a malloc ... not happen ... and then?

What I am saying is: your problem is most likely that you are simply firing up too many of those threads.

Thus the answer is: don't try to fix things after you broke them - better make sure you do not break them at all:

  • do careful profiling, to understand your application; so you can assess how much memory a single thread requires to get its "work" done
  • then change your "main" program to query the hardware it is running on (like: check for available memory and number of physical CPUs that are available)
  • and based on that assessment, start that number of threads that should work given the aforementioned hardware details
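The steps above can be sketched roughly as follows. The 150 MB per-worker figure is a made-up placeholder for whatever profiling actually shows, and in practice the available-memory number would come from a query such as `psutil.virtual_memory().available` rather than being passed in by hand:

```python
import multiprocessing

# Hypothetical profiling result: assume each worker needs ~150 MB.
PER_WORKER_MB = 150

def pick_worker_count(available_mb):
    # never more workers than physical/logical CPUs...
    cpu_limit = multiprocessing.cpu_count()
    # ...and never more than the memory budget allows
    memory_limit = available_mb // PER_WORKER_MB
    return max(1, min(cpu_limit, memory_limit))

print(pick_worker_count(800))   # memory-constrained, e.g. on a 1 GB box
print(pick_worker_count(8000))  # CPU-constrained, e.g. on a large box
```

The `max(1, ...)` guard just ensures at least one worker runs even when the memory budget looks too small.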

Beyond that: this is a very common pattern. The developer has a "powerful" machine he is working on, and he implicitly assumes that any target system running his product will have the same or better characteristics. And that is simply not true.

In other words: when you don't know what the hardware your code is running on looks like, there is only one reasonable thing to do: first acquire that knowledge, and afterwards do different things based on real data.

Upvotes: 5
