phg

Reputation: 315

How to work around Python 3 maximum string size?

On a 64-bit Python build with a sys.maxsize of 9223372036854775807, the interpreter nevertheless throws a MemoryError if I allocate a string of more than 2684354560 (2.5 GiB) chars:

$ python3
Python 3.6.6 (default, Jul 19 2018, 14:25:17) 
[GCC 8.1.1 20180712 (Red Hat 8.1.1-5)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> s = "*" * 2684354560
>>> s = "*" * 2684354561
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
MemoryError

(The limit is the same for bytes, whose element type is definitely 8 bits.) There is plenty of free memory and swap, so I am certain the system is not hitting any physical limit.

What is happening here and how can I increase this cap?

Upvotes: 2

Views: 4340

Answers (1)

phg

Reputation: 315

Resolution: turns out to be the data segment size limit

$ ulimit -d
4194304

ulimit -d reports the limit in KiB, so these 4194304 KiB are 4294967296 B; for some reason, that translates into a 2684354560 B per-allocation cap in Python.
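The limit can also be inspected from within Python; a small diagnostic sketch (the resource module is POSIX-only):

```python
import resource

# getrlimit returns (soft, hard); values are in bytes,
# or resource.RLIM_INFINITY if unlimited.
soft, hard = resource.getrlimit(resource.RLIMIT_DATA)
print("soft:", "unlimited" if soft == resource.RLIM_INFINITY else soft)
print("hard:", "unlimited" if hard == resource.RLIM_INFINITY else hard)
```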

Setting this value to unlimited removes the cap. This can be done externally by the parent process (e.g. ulimit -d unlimited in the shell) or from Python itself, using the resource module that wraps resource.h:

import resource

resource.setrlimit(resource.RLIMIT_DATA,
                   (resource.RLIM_INFINITY, resource.RLIM_INFINITY))

Apparently, on more recent kernels (4.7 and later) RLIMIT_DATA affects anonymous mappings too, which explains both the observed failure of large-ish allocations and my surprise.
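Putting it together, a minimal sketch that lifts the soft limit as far as the existing hard limit before allocating (raising the hard limit itself would need root or CAP_SYS_RESOURCE; the 64 MiB allocation here is just a stand-in for the multi-GiB string above):

```python
import resource

# Current soft/hard data-segment limits (bytes, or RLIM_INFINITY).
soft, hard = resource.getrlimit(resource.RLIMIT_DATA)

# Raising the soft limit up to the existing hard limit is always
# permitted for an unprivileged process.
resource.setrlimit(resource.RLIMIT_DATA, (hard, hard))

# Stand-in allocation; with the limit lifted, multi-GiB strings work too.
s = "*" * (64 * 1024 * 1024)
print(len(s))  # 67108864
```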

Upvotes: 1
