basic

Reputation: 43

Memory-efficient and fastest way to create batches and apply a function to a Python list

I need to create batches of 5 and apply a function to each value of the list in the most memory-efficient and fastest way possible. I want to avoid doing the batching and the function application as two separate steps; it should happen in a single step. Please help me do this efficiently.

Sample code:

import json
from uuid import uuid4

rows = [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15]

rowlist = [r*r for r in rows]

records = [{'Data': json.dumps(rowlist[i:i + 5]), 'PartitionKey': str(uuid4())} for i in range(0, len(rowlist), 5)]

print(records)

Results:

[{'Data': '[1, 4, 9, 16, 25]', 'PartitionKey': '73ba1cba-248c-4b26-982e-1d902627bfe6'}, {'Data': '[36, 49, 64, 81, 100]', 'PartitionKey': '02a986bf-0495-4620-a3d4-0f0b91cd24d6'}, {'Data': '[121, 144, 169, 196, 225]', 'PartitionKey': 'a0ef674e-95f3-4cb0-8e0b-ad052f7726bf'}]

Upvotes: 0

Views: 276

Answers (1)

Mark

Reputation: 92440

If you're having memory issues, you can try to keep everything as an iterator until the last possible moment. Then you can iterate over the records one at a time, or call list(records) if you do want a final list.

(I removed the json and uuid to make the structure clearer):

rows = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]

sqr = lambda r: r * r

# map() squares lazily; zip(*[iterator] * 5) groups the squared values into
# tuples of 5, and the outer generator expression wraps each tuple in a dict.
# Nothing is materialised until the records are actually consumed.
records = ({'Data': group} for group in zip(*[map(sqr, rows)] * 5))

for record in records:
    print(record)

Prints:

{'Data': (1, 4, 9, 16, 25)}
{'Data': (36, 49, 64, 81, 100)}
{'Data': (121, 144, 169, 196, 225)}
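
If the json and uuid wrapping from the question is still needed, the same lazy approach can carry through end to end. Note that the zip(*[iterator] * 5) trick silently drops a final batch of fewer than 5 items; the sketch below keeps any partial batch by slicing the iterator with itertools.islice instead. This is only a sketch: it assumes Python 3.8+ for the walrus operator, and on Python 3.12+ the hand-rolled batched helper could be replaced by the built-in itertools.batched.

import json
from itertools import islice
from uuid import uuid4

rows = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]

def batched(iterable, n):
    # Yield successive tuples of up to n items, keeping a short final batch.
    it = iter(iterable)
    while batch := tuple(islice(it, n)):
        yield batch

# Squaring stays lazy; nothing is materialised until each record is consumed.
squares = (r * r for r in rows)

records = (
    {'Data': json.dumps(list(batch)), 'PartitionKey': str(uuid4())}
    for batch in batched(squares, 5)
)

for record in records:
    print(record)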

Upvotes: 1
