Boomer

Reputation: 392

Keras-tuner gets stuck after arbitrary number of trials using WSL

I recently had to switch to WSL 2 in order to enable GPU computing with TensorFlow. I had a piece of code working on Windows (CPU) that tunes three hyperparameters using keras-tuner. However, after switching to WSL 2, the tuning suddenly stops after an arbitrary number of trials:

[Screenshot: state of keras-tuner]

Sometimes it gets stuck after 3 trials, sometimes after 2. It doesn't crash; it just stays at epoch 1/3, so there is no error message. This is the tuning code (which worked on Windows):

import tensorflow as tf
import keras_tuner as kt

tuner = kt.Hyperband(build_tune_model,
                     objective='val_accuracy',
                     max_epochs=25,
                     factor=3,
                     directory='keras_tuner',
                     project_name='hyperband_tune')

early_stopping = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=5)
tensorboard_callback = tf.keras.callbacks.TensorBoard("/tmp/tb_logs")

tuner.search(
    ds_train,
    epochs=400,
    validation_data=ds_valid,
    steps_per_epoch=train_size // BATCH_SIZE,
    validation_steps=valid_size,
    callbacks=[early_stopping, tensorboard_callback]
    )

Things I have tried:

  1. Waiting for an hour (previous trials took less than 4 min to complete)
  2. Trying different tuners (Bayesian optimization and Hyperband)
  3. Checking the PC's performance stats. The CPU is in use (53%) while the GPU is completely idle (which makes sense, as the epoch never starts). The CPU usage drops when I interrupt the kernel.

[Screenshot: PC performance stats]

Extra info:

Worth noting: I get the following warnings when loading TensorFlow that I did not get before switching to WSL 2:

Could one of these warnings be the issue? Also, I use Jupyter Notebook; I will try running my code from the terminal to see if that works.
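For the terminal run, my plan is roughly the following (`tuning.ipynb` is a placeholder for my notebook's actual name):

```shell
# Export the notebook as a plain Python script, then run it from the
# WSL 2 terminal with TensorFlow's most verbose C++ logging enabled
# (TF_CPP_MIN_LOG_LEVEL=0), so a stall hopefully leaves some trace.
jupyter nbconvert --to script tuning.ipynb
TF_CPP_MIN_LOG_LEVEL=0 python tuning.py
```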

I would greatly appreciate any help. If any information is missing, please let me know. Thank you in advance.

Upvotes: 0

Views: 110

Answers (0)
