Fluxy

Reputation: 2978

Loop through mini batches of numpy array?

I have a numpy array that contains 813698 rows:

len(df_numpy)
Out[55]: 813698

I want to loop through this array using mini batches of 5000.

mini_batch = 5000
i = 0
for each batch in df_numpy:
   mysubset = df_numpy[i:mini_batch+i]
   # …
   i = i + mini_batch

The problem is that len(df_numpy) is not a multiple of mini_batch, so the last mini batch contains fewer than 5000 rows.

How can I loop through df_numpy so that all records of df_numpy are included?

Upvotes: 0

Views: 1256

Answers (1)

Tonechas

Reputation: 13743

This code should get the job done:

mini_batch = 5000
for first in range(0, len(df_numpy), mini_batch):
    mysubset = df_numpy[first:first+mini_batch]
    # ...
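
The loop covers every row because a slice that runs past the end of an array does not raise an error; it is simply truncated, so the final batch holds whatever rows remain. As a rough check against the numbers in the question, the sketch below uses an array of zeros as a stand-in for the real data (an assumption, since only the length matters here):

```python
import math
import numpy as np

# Placeholder standing in for the real data; only its length matters.
df_numpy = np.zeros(813698)
mini_batch = 5000

# Size of every batch the loop produces.
batch_sizes = [
    len(df_numpy[first:first + mini_batch])
    for first in range(0, len(df_numpy), mini_batch)
]

n_batches = len(batch_sizes)      # number of loop iterations
last_size = batch_sizes[-1]       # rows in the final, shorter batch
total_rows = sum(batch_sizes)     # should equal len(df_numpy)
expected_batches = math.ceil(len(df_numpy) / mini_batch)

print(n_batches, last_size, total_rows)  # 163 3698 813698
```

So the loop runs 163 times: 162 full batches of 5000 rows and one last batch of 3698 rows.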

Demo

In [2]: import numpy as np

In [3]: df_numpy = np.arange(13)

In [4]: mini_batch = 5

In [5]: for first in range(0, len(df_numpy), mini_batch):
   ...:    mysubset = df_numpy[first:first+mini_batch]
   ...:    print(mysubset)
[0 1 2 3 4]
[5 6 7 8 9]
[10 11 12]
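
If the batching is needed in more than one place, the same loop could be wrapped in a small generator. This is only a sketch; the name iter_batches is made up for illustration and is not part of numpy:

```python
import numpy as np

def iter_batches(array, batch_size):
    """Yield consecutive slices of `array` holding at most `batch_size` rows."""
    # Hypothetical helper: same range/slice logic as the loop above.
    for first in range(0, len(array), batch_size):
        yield array[first:first + batch_size]

df_numpy = np.arange(13)
batches = list(iter_batches(df_numpy, 5))

for mysubset in batches:
    print(mysubset)

# Gluing the batches back together recovers the original array,
# so no record is skipped or repeated.
recovered = np.concatenate(batches)
```

Each batch is a basic slice, so numpy returns a view of the original array rather than a copy.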

Upvotes: 2
