gog

Reputation: 11347

Sparse array of dicts - efficient representation

In one of my programs I use a sparse array of data, which is currently implemented as an integer-indexed dict like this:

{
   0: {some dict with data},
   1: {some similar but yet different dict},
   10: {...},
   100: {...},
   200: {...},
   etc
}

It turned out that this dict takes too much memory for my purposes. Is there a way to store sparse arrays more efficiently? I'm ready to sacrifice milliseconds of access time for the sake of lower memory consumption. The key range is 0..0xFFFFFF, and the sparseness is about 30%.

Although a 3rd party module might be an option, I'm more interested in a pure python solution.

Thanks.

To clarify, inner dicts are not subject to optimisation, I'm only trying to arrange them in a better way. For simplicity, let's pretend I have strings rather than dicts there:

data = {
   0: "foo",
   1: "bar",
   10: "...",
   100: "...",
   200: "...",
   etc
}

Upvotes: 2

Views: 282

Answers (1)

aquavitae

Reputation: 19114

If the structure is a mapping, then a dict-like object really is the right option, and if memory is an issue then the obvious solution is to work off a file instead. The easiest approach may be to use a pandas Series, which can be used as a dict and can work directly through an HDF5 file (see http://pandas.pydata.org/pandas-docs/stable/io.html#hdf5-pytables).
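A minimal sketch of the Series approach (assumes pandas is installed; the filename `data.h5` and the sample values are illustrative, and the HDF5 step additionally needs the PyTables package):

```python
import pandas as pd

# Build a Series from the sparse dict; the integer keys become the
# index, so label-based lookups work like the original dict.
data = {0: "foo", 1: "bar", 10: "baz", 100: "qux"}
s = pd.Series(data)

print(s.loc[10])  # dict-style access by key -> "baz"

# To move the data out of RAM, persist it to an HDF5 file and read
# it back on demand (requires PyTables; left commented out here):
# s.to_hdf("data.h5", key="data")
# s = pd.read_hdf("data.h5", key="data")
```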

Alternatively, for a pure Python solution, you could use the shelve module from the standard library, which stores the mapping on disk and loads values only as you access them.

Upvotes: 3
