back-new

Reputation: 121

best way to process large data in chunks

I have, for example, more than 20,000 records, and the data looks like this:

data = [{'id': 1}, {'id': 2}, {'id': 3}, ..., {'id': 20000}]

Now I want to upload this data in chunks of 1000. What is the best way to do this with the least overhead?

Upvotes: 0

Views: 894

Answers (2)

user10869670

Reputation: 106

You can use a generator to process the data in batches:

def generateChunks(data, batchsize):
    # Yield successive batchsize-sized slices of data.
    for i in range(0, len(data), batchsize):
        yield data[i:i + batchsize]

Then process the data:

chunks = generateChunks(data, 1000)
for i, chunk in enumerate(chunks):
    print('chunk #', i, ':', chunk)
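Assuming a hypothetical `upload_batch` in place of your real API or database call, the whole flow looks like this (the chunking function is included so the snippet runs on its own):

```python
def generateChunks(data, batchsize):
    # Yield successive batchsize-sized slices of data.
    for i in range(0, len(data), batchsize):
        yield data[i:i + batchsize]

uploaded = []  # record batch sizes so the sketch is checkable

def upload_batch(batch):
    # Hypothetical stand-in for your real upload call.
    uploaded.append(len(batch))

data = [{'id': i} for i in range(1, 20001)]
for chunk in generateChunks(data, 1000):
    upload_batch(chunk)
# 20 calls of 1000 records each
```

Because the generator yields one slice at a time, only the current batch plus the original list is held in memory, so there is no large intermediate list of chunks.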

Upvotes: 0

Daniel Trugman

Reputation: 8491

The best way is to iterate lazily instead of building intermediate lists. In your case, an easy solution is range with a step argument, which produces the start index of each chunk on demand, for example:

range(0, len(data), 1000)

will generate the values 0, 1000, 2000, ...

If you use that in a loop, you can pass each slice to a handler method, for example:

batch = 1000
for i in range(0, len(data), batch):
    handle(data[i:i + batch])
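This slicing approach needs len(), so it only works on sequences. For sources without a length, such as a database cursor or another generator, a similar chunker can be sketched with itertools.islice, which works on any iterable (the name `chunked` here is just an illustration):

```python
from itertools import islice

def chunked(iterable, size):
    # Repeatedly pull up to `size` items; stops when the source is exhausted.
    it = iter(iterable)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch

data = ({'id': i} for i in range(1, 20001))  # a one-shot iterator, no len()
batches = list(chunked(data, 1000))
```

On Python 3.12+, the standard library's itertools.batched provides the same behaviour (yielding tuples rather than lists).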

Upvotes: 2
