systempuntoout

Reputation: 74114

How to split a list into parts that are each smaller than 1 MByte

I have a sorted list of dictionaries returned by a remote API call (typically the response is less than 4 MByte).
I would like to split this list into chunks where the maximum allowed size of any single chunk is 1 MByte.*

The resulting list of chunks needs to preserve the initial sort order; the chunks will then be serialized (via Pickle) and stored in separate Blob fields, each with a maximum size of 1 MByte.

What's the fastest code to achieve that with Python 2.5?

*The number of chunks should be the lowest that satisfies the 1 MByte constraint.

Upvotes: 2

Views: 305

Answers (2)

systempuntoout

Reputation: 74114

I found the pympler library; its asizeof module provides basic size information for one or several Python objects, and has been tested with Python 2.2.3, 2.3.7, 2.4.5, 2.5.1, 2.5.2 and 2.6.

Upvotes: 0

Manuel Salvadores

Reputation: 16525

Following up on my comment: you could use this extension together with the following script. Note that it does not optimize the size of the chunks; it only ensures that none of them is larger than MAX.

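from sizeof import asizeof

MAX = 1024 * 1024  # 1 MByte limit per chunk

matrix = []                # finished chunks, in the original order
new_chunk = []             # chunk currently being filled
size_of_current_chunk = 0  # running size of new_chunk in bytes
for x in your_sorted_list:
    s = asizeof(x)
    # Close the current chunk if adding x would push it past MAX.
    if new_chunk and size_of_current_chunk + s > MAX:
        matrix.append(new_chunk)
        size_of_current_chunk = 0
        new_chunk = []
    size_of_current_chunk += s
    new_chunk.append(x)

# Keep the last, partially filled chunk.
if new_chunk:
    matrix.append(new_chunk)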

The list matrix will then contain lists of objects, with each list measuring no more than MAX bytes according to asizeof.

It'd be interesting to measure the performance of asizeof against just encoding the objects as a JSON string and multiplying the length of that string by sizeof(char).
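Since the chunks end up pickled, another option is to size each item by its pickled length rather than its in-memory size. The following is only a rough sketch under a few assumptions: the function name chunk_by_pickled_size and its parameters are made up for illustration, protocol 2 is an arbitrary choice, and the sum of per-item pickle lengths is only an estimate of the pickled size of the whole chunk, so it is worth checking len(pickle.dumps(chunk, protocol)) before writing to the Blob field.

```python
import pickle


def chunk_by_pickled_size(items, max_bytes, protocol=2):
    """Greedily group items, in order, using per-item pickled length.

    Hypothetical helper: the per-item sum only approximates the size
    of the pickled chunk, so verify the final pickle before storing it.
    """
    chunks = []
    current = []
    current_size = 0
    for item in items:
        item_size = len(pickle.dumps(item, protocol))
        if item_size > max_bytes:
            # A single oversized item can never fit in any chunk.
            raise ValueError("single item exceeds max_bytes")
        # Start a new chunk when this item would overflow the current one.
        if current and current_size + item_size > max_bytes:
            chunks.append(current)
            current = []
            current_size = 0
        current.append(item)
        current_size += item_size
    # Keep the last, partially filled chunk.
    if current:
        chunks.append(current)
    return chunks
```

Usage would look something like chunks = chunk_by_pickled_size(your_sorted_list, 1024 * 1024). Whether this is faster or slower than asizeof would need to be measured on real data.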

Upvotes: 1
