Reputation: 181
I have built a PyTorch model and performed hyperparameter tuning using the Hyperopt library. The results are not reproducible, even though I already call the seeding function below at the beginning of each run:
util.py
import random

import numpy as np
import torch

def seed_everything(seed=42):
    random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    np.random.seed(seed)
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False
Trainer.py
def train(self, params):
    util.seed_everything()
    # my training code here
Upon further inspection, I found that the first result of the Hyperopt tuning is always reproducible, but the subsequent runs are not. This is unexpected, since I already call seed_everything() at the beginning of the train function.
In addition, if I run the training as below:
for i in range(2):
    print("in iteration ", i)
    trainer = Trainer(**configs)
    trainer.train(params)
The results of iterations 1 and 2 differ from each other, but each is always the same (i.e. iteration 1 always gives a train_loss of 1.31714, while iteration 2 always gives 4.31235).
I was hoping that iterations 1 and 2 would give the same result, since training should be reproducible.
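The pattern described above (iterations differ from each other, but each iteration is the same on every program run) can be reproduced in miniature with Python's random module alone. This is only an illustrative sketch, not the asker's actual code: the Trainer class and its init_noise attribute are hypothetical, standing in for any randomness consumed before train() reseeds (such as model weight initialization in a constructor).

```python
import random

class Trainer:
    def __init__(self):
        # Randomness consumed *before* train() reseeds,
        # e.g. random weight initialization in the constructor.
        self.init_noise = random.random()

    def train(self):
        random.seed(42)  # reseeding inside train, as in the question
        return self.init_noise + random.random()

random.seed(0)  # program start
results = [Trainer().train() for _ in range(2)]
# The two iterations differ, because the second Trainer is constructed
# from the RNG state left behind by the first train() call; yet rerunning
# the whole program reproduces the same pair of values every time.
```

Because each iteration starts from a different (but deterministic) RNG state at construction time, the per-iteration results are stable across program runs while still differing from one another.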
Upvotes: 4
Views: 720
Reputation: 401
It seems the seeds you set at the start are not linked to Hyperopt's fmin function. You can link them by passing a seeded np.random.default_rng generator via the rstate parameter, as shown below; also, do not forget to set the random_state parameter in the model.
fmin(fn=objective, space=space, algo=tpe.suggest, max_evals=100,
     trials=trials, rstate=np.random.default_rng(42))
It should work; however, if it does not, the problem most likely lies with parallelisation, so try setting n_jobs = -1 and then debug the parallelisation parameters accordingly.
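For what it's worth, the reason rstate makes fmin repeatable is that NumPy's Generator objects produce identical streams when created with the same seed, so Hyperopt's sampling of the search space becomes deterministic. A minimal sketch of that property, independent of Hyperopt:

```python
import numpy as np

# Two generators created with the same seed yield identical streams;
# passing such a generator as rstate fixes fmin's sampling sequence.
g1 = np.random.default_rng(42)
g2 = np.random.default_rng(42)
draws1 = g1.uniform(size=5)
draws2 = g2.uniform(size=5)
assert np.array_equal(draws1, draws2)
```

Note that this only pins down Hyperopt's side of the randomness; the training code itself still needs its own seeding (as in the question's seed_everything) to be fully reproducible.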
Upvotes: 0