Reputation: 89
I am relatively new to Machine Learning and TensorFlow, and I want to try to implement mini-batch gradient descent on the MNIST dataset. However, I am not sure how I should implement it.
(Side note: the training images (28px by 28px) and labels are stored in NumPy arrays.)
At the moment, I can see 2 different ways to implement it:
My training images are in a NumPy array of shape [60000, 28, 28]. Reshape this into [25 (number of batches), 2400 (images per batch), 28, 28] and then use a for loop to pass each batch to model.fit(). The only thing that worries me about this method is that for loops are inherently slow, and a vectorised implementation would be much quicker.
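A minimal sketch of this first option, using placeholder zero arrays in place of the real MNIST data (the array shapes and batch count come from the question; the training call in the comment is only one possible way to consume the batches):

```python
import numpy as np

# Placeholder data standing in for the question's MNIST arrays.
images = np.zeros((60000, 28, 28), dtype=np.float32)
labels = np.zeros((60000,), dtype=np.int64)

num_batches = 25
batch_size = 60000 // num_batches  # 2400 images per batch

# Reshape so the leading axis indexes the batch.
batched_images = images.reshape(num_batches, batch_size, 28, 28)
batched_labels = labels.reshape(num_batches, batch_size)

# Training would then loop over the leading axis, e.g.
# for i in range(num_batches):
#     model.train_on_batch(batched_images[i], batched_labels[i])
```

Note that the reshape itself is cheap (it is a view, not a copy); the per-batch Python loop is where the overhead the question worries about would come from.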
Combine the images and labels into a TensorFlow Dataset object, call the Dataset.batch() and Dataset.prefetch() methods, and then pass the data to model.fit(). The only problem with this is that my data no longer stays a NumPy array, which I feel has more flexibility than a TensorFlow Dataset object.
Which of these 2 methods would be best to implement, or is there a third way that is best that I am not aware of?
Upvotes: 5
Views: 4536
Reputation: 438
Keras has a built-in batch_size argument to its model.fit method (since you tagged this question with keras, I assume you're using it). I believe this will probably be the best-optimised way to achieve what you're looking for.
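A minimal sketch of that approach; the model architecture here is illustrative only (a single dense layer, not anything from the question), and the arrays are placeholders for the real MNIST data:

```python
import numpy as np
import tensorflow as tf

# Illustrative model: flatten the 28x28 image and classify into 10 digits.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(28, 28)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(
    optimizer="adam",
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)

# Placeholder data standing in for the question's NumPy arrays.
images = np.zeros((60000, 28, 28), dtype=np.float32)
labels = np.zeros((60000,), dtype=np.int64)

# Keras slices the arrays into mini-batches internally; no manual
# loop or reshape is needed.
history = model.fit(images, labels, batch_size=2400, epochs=1, verbose=0)
```

With batch_size=2400 and 60000 samples, Keras runs 25 optimisation steps per epoch, which is exactly the split described in the question.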
Upvotes: 4