Andrew Spott

Reputation: 3627

memory efficient data structures in python

I have a large number of identical dictionaries (identically structured: same keys, different values), which leads to two different memory problems:

What is a good way that I can share the labels (so each label is not stored in the object), and compress the memory?

Upvotes: 2

Views: 2451

Answers (1)

intellimath

Reputation: 2504

One possible solution to this problem is based on the recordclass library:

pip install recordclass

>>> import sys
>>> from recordclass import make_dataclass

For a given set of labels, you create a class:

>>> DataCls = make_dataclass('DataCls', 'first second third')
>>> data = DataCls(first="red", second="green", third="blue")
>>> print(data)
DataCls(first='red', second='green', third='blue')
>>> print('Memory size:', sys.getsizeof(data), 'bytes')
Memory size: 40 bytes

It is fast and uses minimal memory: the labels are stored once on the class rather than in every instance, which makes it suitable for creating millions of instances.

The downside: it is a C extension and not part of the standard library, but it is available on PyPI.
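For comparison, if a third-party dependency is not an option, a class with __slots__ is a standard-library technique that also keeps the labels on the class instead of in each object. This is a different technique from recordclass, not its implementation; the class name SlotRecord and the field names below are hypothetical, and exact sizes depend on the Python version and platform, so the sketch only prints them:

```python
import sys


class SlotRecord:
    """Hypothetical stdlib-only record: labels live on the class, not on each instance."""

    # __slots__ replaces the per-instance __dict__ with fixed attribute slots.
    __slots__ = ("first", "second", "third")

    def __init__(self, first, second, third):
        self.first = first
        self.second = second
        self.third = third


# The same data stored as a plain dict and as a slotted instance.
as_dict = {"first": "red", "second": "green", "third": "blue"}
as_slots = SlotRecord("red", "green", "blue")

# sys.getsizeof reports the container only, not the strings it refers to.
dict_size = sys.getsizeof(as_dict)
slots_size = sys.getsizeof(as_slots)

print("dict:", dict_size, "bytes")
print("__slots__ instance:", slots_size, "bytes")
```

On CPython the slotted instance is typically several times smaller than the equivalent dict, though usually not quite as small as the recordclass figure quoted above.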

Addition: Starting with recordclass version 0.15, there is an option fast_new for faster instance creation:

>>> DataCls = make_dataclass('DataCls', 'first second third', fast_new=True)

If you do not need keyword arguments, instance creation becomes roughly twice as fast. Starting with version 0.22 this is the default behavior, and the option fast_new=True can be omitted.

P.S.: I am the author of the recordclass library.

Upvotes: 2
