Reputation: 912
I want to perform hyperparameter optimization on my Keras model. The problem is that the dataset is quite big: normally in training I use fit_generator to load the data in batches from disk, but common packages like scikit-learn's GridSearchCV, Talos, etc. only support the fit method.
I tried to load the whole data to memory, by using this:
# train_datagen is an ImageDataGenerator; batch_size=train_nb requests the
# entire training set as a single batch, so one call to next() loads it all
train_generator = train_datagen.flow_from_directory(
    original_dir,
    target_size=(img_height, img_width),
    batch_size=train_nb,
    class_mode='categorical')
X_train, y_train = train_generator.next()
But when performing the grid search, the OS kills the process because of the large memory usage. I also tried undersampling my dataset to only 25%, but it is still too big.
Does anyone have experience with the same scenario? Can you please share your strategy for performing hyperparameter optimization on a large dataset?
Following the answer of @dennis-ec, I tried the SkOpt tutorial here: http://slashtutorial.com/ai/tensorflow/19_hyper-parameters/ and it was a very comprehensive tutorial.
Upvotes: 3
Views: 2311
Reputation: 2156
In my opinion grid search is not a good method for hyperparameter optimization, especially in deep learning, where you have many hyperparameters.
I would recommend Bayesian hyperparameter optimization. Here is a tutorial on how to implement this using skopt. As you can see, you need to write a function that does your training and returns the validation score to optimize on, so the API does not care whether you use fit or fit_generator from Keras. A minimal sketch of such an objective function is shown below.
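For illustration only, here is a minimal sketch assuming scikit-optimize and a standalone Keras 2.x setup. The directory paths (train_dir, val_dir), image size, the small CNN and the two tuned parameters (learning_rate, dense_units) are placeholders I made up, not anything from your code:

from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense
from keras.optimizers import Adam
from keras.preprocessing.image import ImageDataGenerator
from skopt import gp_minimize
from skopt.space import Real, Integer
from skopt.utils import use_named_args

img_height, img_width = 150, 150                # adjust to your data
train_dir, val_dir = 'data/train', 'data/val'   # hypothetical directory layout

# The generators stream batches from disk, so the full dataset never has to fit in memory
datagen = ImageDataGenerator(rescale=1. / 255)
train_gen = datagen.flow_from_directory(train_dir, target_size=(img_height, img_width),
                                        batch_size=32, class_mode='categorical')
val_gen = datagen.flow_from_directory(val_dir, target_size=(img_height, img_width),
                                      batch_size=32, class_mode='categorical')

# Search space: learning rate on a log scale and the size of the dense layer
dimensions = [
    Real(1e-5, 1e-2, prior='log-uniform', name='learning_rate'),
    Integer(64, 512, name='dense_units'),
]

@use_named_args(dimensions=dimensions)
def objective(learning_rate, dense_units):
    # Build and train a small CNN with the sampled hyperparameters
    model = Sequential([
        Conv2D(32, (3, 3), activation='relu', input_shape=(img_height, img_width, 3)),
        MaxPooling2D(),
        Flatten(),
        Dense(dense_units, activation='relu'),
        Dense(len(train_gen.class_indices), activation='softmax'),
    ])
    model.compile(optimizer=Adam(lr=learning_rate),
                  loss='categorical_crossentropy', metrics=['accuracy'])
    model.fit_generator(train_gen, steps_per_epoch=len(train_gen),
                        epochs=3, verbose=0)
    # gp_minimize minimizes, so return the negative validation accuracy
    _, val_acc = model.evaluate_generator(val_gen, steps=len(val_gen))
    return -val_acc

result = gp_minimize(objective, dimensions, n_calls=15, random_state=0)
print('Best validation accuracy:', -result.fun)
print('Best hyperparameters:', result.x)

The only thing skopt sees is the scalar returned by the objective, so you are free to train inside it however you like, including with fit_generator.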
Upvotes: 2
Reputation: 1246
See this question: how use grid search with fit generator in keras
The first answer seems to answer your question.
Upvotes: 1