charlesreid1

Reputation: 4821

Running a Python process with limited memory

We have a Python command line utility that makes API calls to download large files, and we would like to see how the program behaves when it is run on machines that have limited memory (~1 GB or less).

One option to do this would be to use a virtualization layer (Docker container or Vagrant virtual box) to create an OS with a specified amount of memory. But I am interested in a different approach.

I am wondering if there is a way to throttle the amount of memory available to a Python process, so that I could run the command line interface to make the API calls, but limit it to a maximum of (say) 512 MB of memory to test for out-of-memory issues.

Looking for solutions for either Mac OS X (running 10.14+) or Linux.

Upvotes: 4

Views: 6842

Answers (2)

charlesreid1

Reputation: 4821

Using setrlimit to set maximum memory size

As per @Alexis Drakopoulos's answer, the resource module can be used to set the maximum amount of virtual memory used by a Python script, with the caveat that this approach only works on Linux-based systems, and does not work on BSD-based systems like Mac OS X.

To modify the limit, add the following call to setrlimit in your Python script:

resource.setrlimit(resource.RLIMIT_AS, (soft_lim, hard_lim))

(where the soft and hard limits are given in bytes, and their values are usually set equal).

Quick example

Here is a quick example that limits the memory to about 1 KB (1000 bytes), then fails to import pandas due to a memory error:

$ python
Python 3.6.9 (default, Nov  7 2019, 10:44:02)
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import resource

>>> print(resource.getrlimit(resource.RLIMIT_AS))
(-1, -1)

>>> resource.setrlimit(resource.RLIMIT_AS, (1000,1000))

>>> import pandas as pd
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/vagrant/.local/lib/python3.6/site-packages/pandas/__init__.py", line 11, in <module>
  File "/home/vagrant/.local/lib/python3.6/site-packages/numpy/__init__.py", line 142, in <module>
  File "/home/vagrant/.local/lib/python3.6/site-packages/numpy/core/__init__.py", line 24, in <module>
  File "<frozen importlib._bootstrap>", line 971, in _find_and_load
  File "<frozen importlib._bootstrap>", line 955, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 665, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 674, in exec_module
  File "<frozen importlib._bootstrap_external>", line 779, in get_code
  File "<frozen importlib._bootstrap_external>", line 487, in _compile_bytecode
MemoryError
MemoryError
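
To test a command-line tool under a memory cap without limiting your own interactive session, the same `setrlimit` call can be applied to a child process only. Here is a sketch (POSIX-only; the `run_limited` helper and the byte sizes are illustrative, not part of the `resource` API) that uses `subprocess` with a `preexec_fn` so the cap is set in the child between `fork()` and `exec()`:

```python
import resource
import subprocess
import sys

def run_limited(cmd, max_bytes):
    """Run cmd in a child process with its virtual memory capped at max_bytes."""
    def set_limit():
        # Runs in the child after fork() and before exec(), so only the
        # child process is capped - the parent is unaffected.
        resource.setrlimit(resource.RLIMIT_AS, (max_bytes, max_bytes))
    return subprocess.run(cmd, preexec_fn=set_limit)

cap = 512 * 1024 ** 2  # 512 MiB

# Allocating 1 GiB under a 512 MiB cap raises MemoryError (nonzero exit)...
big = run_limited([sys.executable, "-c", "bytearray(2**30)"], cap)

# ...while a small allocation stays under the cap and exits cleanly.
small = run_limited([sys.executable, "-c", "bytearray(2**20)"], cap)

print(big.returncode, small.returncode)
```

This keeps the limit scoped to each run of the tool, so a misbehaving test can't take down the shell it was launched from.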

Why doesn't this work on BSD systems?

If you check the man page for getrlimit or setrlimit, you'll see a list of RLIMIT_* variables - but the list is different between BSD and Linux. The Linux getrlimit/setrlimit man page lists RLIMIT_AS, but the BSD getrlimit/setrlimit man page does not list any RLIMIT variable for controlling the amount of memory. So, even though resource.RLIMIT_AS is defined in the resource module on Mac OS X, setting it has no effect on the kernel or on the amount of memory available to the process.
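
In practice this means any `setrlimit`-based approach should be gated on the platform. A sketch (the `try_limit` helper name is mine, not from the `resource` module) that applies the cap only where the kernel honors it:

```python
import resource
import sys

def try_limit(max_bytes):
    """Cap this process's address space, but only where the kernel honors it."""
    if not sys.platform.startswith("linux"):
        # On Mac OS X / BSD, resource.RLIMIT_AS is defined but has no effect,
        # so report failure and let the caller fall back to Docker/Vagrant.
        return False
    soft, hard = resource.getrlimit(resource.RLIMIT_AS)
    # Asking for a soft limit above the hard limit raises ValueError,
    # so clamp the request to the hard limit when one is set.
    if hard != resource.RLIM_INFINITY:
        max_bytes = min(max_bytes, hard)
    resource.setrlimit(resource.RLIMIT_AS, (max_bytes, hard))
    return True

print(try_limit(8 * 1024 ** 3))  # True on Linux, False on Mac OS X
```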

Also see What do the two numbers returned by Python's resource.RLIMIT_VMEM (or resource.RLIMIT_AS) mean?

Upvotes: 4

Alexis Drakopoulos

Reputation: 1145

If you really want to do this in Python itself, try taking a look at the resource module docs, which describe how to use setrlimit() to set limits on the resources listed there - memory being one of them. Not sure if that's helpful.

Upvotes: 1
