user8134203

Reputation:

Increasing memory limit in Python?

I am currently using a function that builds extremely large dictionaries (used to compare DNA strings), and sometimes I get a MemoryError. Is there a way to allot more memory to Python so it can deal with more data at once?

Upvotes: 27

Views: 118036

Answers (4)

Mandy

Reputation: 173

Python raises MemoryError when it hits the limit of your system's RAM; there is no lower limit unless you've defined one manually with the resource package.

Defining your class with __slots__ tells the Python interpreter that the attributes/members of your class are fixed, and it can lead to significant memory savings: the interpreter skips creating a per-instance __dict__ and stores the attributes in a fixed layout instead.
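A minimal sketch of the idea (the class names here are purely illustrative):

```python
import sys

class Plain:
    def __init__(self):
        self.a, self.b = 1, 2

class Slotted:
    __slots__ = ('a', 'b')   # fixed attribute set, no per-instance __dict__
    def __init__(self):
        self.a, self.b = 1, 2

p, s = Plain(), Slotted()
print(hasattr(p, '__dict__'))     # True: every instance carries a dict
print(hasattr(s, '__dict__'))     # False: attributes live in fixed slots
print(sys.getsizeof(p.__dict__))  # per-instance overhead the slotted class avoids
```

With millions of instances, skipping that per-instance dict adds up to a substantial saving.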

If the memory consumed by your Python process continues to grow over time, this is usually a combination of:

  • How Python's C memory allocator works. This is essentially memory fragmentation: the allocator cannot call free unless an entire memory chunk is unused, and chunk usage is usually not perfectly aligned with the objects you are creating and using.
  • Creating many small strings to compare data. Python interns some strings internally, but building large numbers of short strings still puts load on the interpreter.
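To illustrate interning (a sketch; the identity behaviour of strings built at run time varies across Python implementations):

```python
import sys

a = ''.join(['AC', 'GT'])   # built at run time: typically NOT auto-interned
b = ''.join(['AC', 'GT'])
print(a == b)               # True: equal contents, but usually separate objects

# sys.intern returns one canonical object per distinct string value,
# so millions of repeated dict keys can share a single copy:
a2, b2 = sys.intern(a), sys.intern(b)
print(a2 is b2)             # True: both names point at the same object
```

Interning repeated keys also speeds up dict lookups, since equal interned strings can be compared by identity.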

One approach is to create a worker thread (or a single-threaded pool) to do the work, then invalidate/kill the worker to free the resources it held.

The code below creates a single-thread worker:

import concurrent.futures
import logging
import threading

logger = logging.getLogger(__name__)
lock = threading.Lock()
errorResultMap = []

def process_dna_compare(dna1, dna2):
    '''max_workers=1 creates a single-thread pool'''
    with concurrent.futures.ThreadPoolExecutor(max_workers=1) as executor:
        futures = {executor.submit(getDnaDict, lock, dna_key): dna_key
                   for dna_key in dna1}
        count = 0
        for future in concurrent.futures.as_completed(futures):
            result_dict = future.result()
            if result_dict:
                count += 1
                '''Do your processing XYZ here'''
    logger.info('Total dna keys processed %d', count)

def getDnaDict(lock, dna_key):
    '''process dna_key here and return the item'''
    try:
        dataItem = item[0]  # look up the item for dna_key here
        return dataItem
    except Exception:
        with lock:
            errorResultMap.append({'dna_key': dna_key,
                                   'error': 'No data for dna found'})
        logger.error('Error in processing dna: %s', dna_key)

if __name__ == "__main__":
    dna1 = '''get data for dna1'''
    dna2 = '''get data for dna2'''
    process_dna_compare(dna1, dna2)
    if errorResultMap:
        '''print or write errorResultMap to a file'''

The code below will help you understand memory usage:

import objgraph
import random
import inspect

class Dna(object):
    def __init__(self):
        self.val = None
    def __str__(self):
        return "dna - val: {0}".format(self.val)

def f():
    l = []
    for i in range(3):
        dna = Dna()
        #print("id of dna: {0}".format(id(dna)))
        #print("dna is: {0}".format(dna))
        l.append(dna)
    return l

def main():
    d = {}
    l = f()
    d['k'] = l
    print("list l has {0} objects of type Dna()".format(len(l)))
    objgraph.show_most_common_types()
    objgraph.show_backrefs(random.choice(objgraph.by_type('Dna')),
                           filename="dna_refs.png")

    objgraph.show_refs(d, filename='myDna-image.png')

if __name__ == "__main__":
    main()

Output for memory usage :

list l has 3 objects of type Dna()
function                   2021
wrapper_descriptor         1072
dict                       998
method_descriptor          778
builtin_function_or_method 759
tuple                      667
weakref                    577
getset_descriptor          396
member_descriptor          296
type                       180

For more reading on slots, please visit: https://elfsternberg.com/2009/07/06/python-what-the-hell-is-a-slot/

Upvotes: 1

phnghue

Reputation: 1686

Although Python doesn't limit the memory usage of your program, the OS dynamically limits the CPU and RAM available to every program to keep the whole machine performing well.

I work on graphic fractal generation, which needs arrays of as many billions of characters as possible for fast generation. I found that the OS imposes a soft limit per Python program that decreases the real performance of the algorithms on the machine.

  • When you increase RAM and CPU usage (bigger buffers, longer loops, more threads), the total speed of the process decreases: the useful thread count and thread frequency drop, so the result takes longer to compute. But if you reduce the resource usage to 50...75% of the old configuration (smaller buffer size, smaller loops, fewer threads, lower frequency or longer threading timers), split the task into multiple parts, and run multiple console Python programs to process all the parts at the same time, it takes much less time to finish. When you then check the CPU and RAM usage, it reaches much higher utilization than the old single-program, multi-threaded method.

  • This means that when we build a program for high performance, speed, and giant data processing, we should design it as multiple background programs, each running multiple threads. Also, bring disk space into the process rather than using memory only.

  • Even when you reach maximum physical memory, processing giant data all at once slows and limits the application more than processing it split into several parts.

  • Optionally, if your program is a specialized application, consider:

    1. Using the graphics card's computing power: OpenGL with graphics acceleration for Intel graphics, or CUDA for NVIDIA.
    2. Switching to a 64-bit OS and 64-bit Python.
    3. Switching back to an older (but not too old) operating system: Windows 7 64-bit or an older Ubuntu/Linux. Newer OSes have more costly luxury features and services that consume computer resources. You will see a dramatic improvement in Python's ability and operating speed when using an SSD and switching from Windows 11/10 back to 7.
    4. Switching to Safe Mode if you use Windows.
  • Creating multiple background programs lets you build a high-speed graphics application that does not require the user to install a graphics card. This is also a good choice if you want a program that integrates AI and costly compute power to be easy to distribute to end users: upgrading RAM in the 4-16 GB range is cheaper than a graphics card.

    ##//////////////////////////////////
    ##////////// RUNTIME PACK //////////  FOR PYTHON
    ##//
    ##//                                  v2021.08.12 : add Handler
    ##//
    ##//  Module by Phung Phan: [email protected]
    ##
    ##
    import time;
    import threading;
    # var Handler = function(){this.post = function(r){r();};return this;}; Handler.post = function(r){r();}

    ERUN=lambda:0;

    def run(R): R();
    def Run(R): R();
    def RUN(R): R();

    def delay(ms):time.sleep(ms/1000.0); # Control while speed
    def delayF(R,delayMS):
        t=threading.Timer(delayMS/1000.0,R)
        t.start();
        return t;
    def setTimeout(R,delayMS):
        t=threading.Timer(delayMS/1000.0,R)
        t.start();
        return t;
        
    class THREAD:
        def __init__(this):
            this.R_onRun=None;
            this.thread=None;
        def run(this):
            this.thread=threading.Thread(target=this.R_onRun);
            this.thread.start();
        def isRun(this): return this.thread.is_alive();

    AInterval=[];
    class setInterval :
        def __init__(this,R_onRun,msInterval) :
            this.ms=msInterval;
            this.R_onRun=R_onRun;
            this.kStop=False;
            this.kPause=False;
            this.thread=THREAD();
            this.thread.R_onRun=this.Clock;
            this.thread.run();
            this.id=len(AInterval); AInterval.append(this);
        def Clock(this) :
            while not this.kPause :
                this.R_onRun();
                delay(this.ms);
        def pause(this) :
            this.kPause=True;
        def stop(this) :
            this.kPause=True;
            this.kStop=True;
            AInterval[this.id]=None;
        def resume(this) :
            if (this.kPause and not this.kStop) :
                this.kPause=False;
                this.thread.run();
        
    def clearInterval(timer): timer.stop();
    def clearAllInterval():
        for i in AInterval:
            if i is not None: i.stop();

    def cycleF(R_onRun,msInterval):return setInterval(R_onRun,msInterval);
    def stopCycleF(timer):
        if not isinstance(timer, str):
            try: timer.stop();
            except:pass;

    ##########
    ## END ### RUNTIME PACK ##########
    ##########

    import subprocess;

    def process1(): subprocess.call("python process1.py", shell=True);
    def process2(): subprocess.call("python process2.py", shell=True);
    def process3(): subprocess.call("python process3.py", shell=True);
    def process4(): subprocess.call("python process4.py", shell=True);
        
    setTimeout(process1,100);
    setTimeout(process2,100);
    setTimeout(process3,100);
    setTimeout(process4,100);
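The same split-the-work-across-processes idea can also be sketched with the standard library's multiprocessing module (the chunk size, worker count, and GC-counting task here are arbitrary illustrations):

```python
from multiprocessing import Pool

def count_gc(chunk):
    # Count G and C bases in one slice of the DNA string
    return chunk.count('G') + chunk.count('C')

if __name__ == "__main__":
    dna = "ACGT" * 1_000_000
    # Split the giant string into parts so worker processes share the load,
    # and each process keeps only its own slice in memory
    parts = [dna[i:i + 100_000] for i in range(0, len(dna), 100_000)]
    with Pool(processes=4) as pool:
        total = sum(pool.map(count_gc, parts))
    print(total)  # 2000000
```

Unlike threads, separate processes each get their own memory space and bypass the GIL, which matches the "multiple background programs" design described above.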

Upvotes: 0

Tim

Reputation: 2161

Try updating your Python from 32-bit to 64-bit.

Simply type python in the command line and you will see which build yours is. The memory available to 32-bit Python is very low: a 32-bit process can address at most 2-4 GB, no matter how much RAM the machine has.
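You can also check from within Python itself:

```python
import struct
import sys

print(struct.calcsize('P') * 8)  # pointer size in bits: 64 on a 64-bit build
print(sys.maxsize > 2**32)       # True on 64-bit Python, False on 32-bit
```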

Upvotes: -1

cs95

Reputation: 402713

Python doesn't limit the memory usage of your program. It will allocate as much memory as your program needs until your computer is out of memory. The most you can do is reduce the limit to a fixed upper cap. That can be done with the resource module, but it isn't what you're looking for.
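For completeness, a sketch of capping the limit with resource (Unix-only; the 4 GiB figure is an arbitrary example):

```python
import resource  # available on Linux/macOS, not on Windows

soft, hard = resource.getrlimit(resource.RLIMIT_AS)  # address-space limit
cap = 4 * 1024**3                                    # e.g. ~4 GiB
if hard != resource.RLIM_INFINITY:
    cap = min(cap, hard)                             # soft limit may not exceed hard
resource.setrlimit(resource.RLIMIT_AS, (cap, hard))
# Allocations beyond the cap now raise MemoryError instead of exhausting the machine
```

This only moves the failure point earlier; it does not give the program more memory.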

You'd need to look at making your code more memory/performance friendly.
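For example, instead of materialising one huge dictionary of differences, a generator lets you stream through the comparison and keep only one item in memory at a time (the function name and sample strings are illustrative):

```python
def mismatches(dna1, dna2):
    """Yield (position, base1, base2) for each position where the strings differ."""
    for i, (a, b) in enumerate(zip(dna1, dna2)):
        if a != b:
            yield i, a, b

# Consume lazily: nothing is stored beyond the current mismatch
for pos, a, b in mismatches("ACGTACGT", "ACGAACGG"):
    print(pos, a, b)  # 3 T A, then 7 T G
```

If you truly need random access to all differences, consider an on-disk mapping (e.g. the shelve or sqlite3 modules) rather than an in-memory dict.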

Upvotes: 34
