Reputation: 219
I have a Python array of shape (19, 73984) - it represents 19 flattened grayscale images of size 272 x 272 px. I want to be able to process this and feed it into a feed-forward neural network, but I want to feed it in batches.
I expect to have some kind of function that will be run in a for loop. That function should receive the dataset array, the batch size and also the iteration's index value, in order to know how many items to return and from which position.
ex:
def get_batch_data(i, dataset, batch_size):
where i is a for-loop iteration index that will be used to return a chunk of data starting from a certain position, until the dataset is looped over.
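Roughly, I imagine something like this (just an untested sketch of what I mean, treating i as the start position):

def get_batch_data(i, dataset, batch_size):
    # return batch_size items starting at position i
    return dataset[i:i + batch_size]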
Is there a better way to do this or can you help me with this?
Thanks!
Upvotes: 0
Views: 1337
Reputation: 51673
Testdata:
bigArr = [[x,x+1,x+2,x+3] for x in range(1,1000,4) ] # 250 subLists
Easiest would probably be islice() from itertools:

import itertools

print(list(itertools.islice(bigArr, 5, 10)))  # start 5, stop 10, implicit step 1
Docs: islice(), which takes your list, a start value, a stop value and a step - and does what you want as a one-liner.
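If you only need consecutive batches, a related pattern (a minimal sketch of my own, not from the docs) is to wrap the list in an iterator once and let islice() consume it chunk by chunk, so you never re-scan from the start:

import itertools

it = iter(bigArr)
while True:
    batch = list(itertools.islice(it, 5))  # take the next 5 elements
    if not batch:  # iterator is exhausted
        break
    print(batch)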
You could also leverage itertools.compress with a sliding True window for the elements you want:
# only show the 5th to 10th (excluded) element:
varParts = itertools.compress(bigArr,  # first list
    [1 if x in range(5, 10) else 0 for x in range(len(bigArr))])  # second list
# consume iterator:
print(list(varParts))
compress() only returns values from the first list whose corresponding entry in the second list evaluates to True - the second list is built in a way that only the wanted elements are selected.
Docs: compress()
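To illustrate the selector behaviour (example adapted from the itertools documentation):

import itertools

print(list(itertools.compress('ABCDEF', [1, 0, 1, 0, 1, 1])))  # ['A', 'C', 'E', 'F']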
Or do it all by hand, using slicing on the big array like this:
def get_batch_data(i, arr, batchSize):
return arr[i:min(len(arr),i+batchSize)]
Use it like this:
for i in range(0,len(bigArr),5):
print(get_batch_data(i,bigArr,5)) # creates sub-slices - which take memory
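For the (19, 73984) array from the question, the same slicing works directly on a NumPy array; a minimal sketch (the random data is only a stand-in for the real images):

import numpy as np

dataset = np.random.rand(19, 73984)  # 19 flattened 272 x 272 grayscale images
batch_size = 4

for i in range(0, dataset.shape[0], batch_size):
    batch = dataset[i:i + batch_size]  # a view, no copy is made
    print(batch.shape)  # (4, 73984) ... the last batch is (3, 73984)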
Upvotes: 2