Reputation: 95
Optuna's TPESampler and RandomSampler suggest the same integer values (and possibly the same floats and log-uniform values) for a parameter more than once. I couldn't find a way to stop them from suggesting the same values over and over again. Out of 100 trials, quite a few are just duplicates: the number of unique suggested values ends up around 80-90 out of 100. If I include more parameters for tuning, say 3, I even see all 3 of them getting the same combination of values a few times within 100 trials.
For example, 75 was suggested for min_data_in_leaf three times:
[I 2020-11-14 14:44:05,320] Trial 8 finished with value: 45910.54012028659 and parameters: {'min_data_in_leaf': 75}. Best is trial 4 with value: 45805.19030897498.
[I 2020-11-14 14:44:07,876] Trial 9 finished with value: 45910.54012028659 and parameters: {'min_data_in_leaf': 75}. Best is trial 4 with value: 45805.19030897498.
[I 2020-11-14 14:44:10,447] Trial 10 finished with value: 45831.75933279074 and parameters: {'min_data_in_leaf': 43}. Best is trial 4 with value: 45805.19030897498.
[I 2020-11-14 14:44:13,502] Trial 11 finished with value: 46125.39810101329 and parameters: {'min_data_in_leaf': 4}. Best is trial 4 with value: 45805.19030897498.
[I 2020-11-14 14:44:16,547] Trial 12 finished with value: 45910.54012028659 and parameters: {'min_data_in_leaf': 75}. Best is trial 4 with value: 45805.19030897498.
Example code below:
import numpy as np
import lightgbm as lgb
import optuna
from optuna.samplers import TPESampler
from sklearn.model_selection import StratifiedKFold
from sklearn.metrics import mean_squared_error

def lgb_optuna(trial):
    rmse = []
    params = {
        "seed": 42,
        "objective": "regression",
        "metric": "rmse",
        "verbosity": -1,
        "boosting": "gbdt",
        "num_iterations": 1000,
        "min_data_in_leaf": trial.suggest_int("min_data_in_leaf", 1, 100),
    }
    # tfd_train is my training array; the last two columns hold target/stratification info
    cv = StratifiedKFold(n_splits=5, random_state=42, shuffle=False)
    for train_index, test_index in cv.split(tfd_train, tfd_train[:, -1]):
        X_train, X_test = tfd_train[train_index], tfd_train[test_index]
        y_train = X_train[:, -2].copy()
        y_test = X_test[:, -2].copy()
        dtrain = lgb.Dataset(X_train[:, :-2], label=y_train)
        dtest = lgb.Dataset(X_test[:, :-2], label=y_test)
        booster_gbm = lgb.train(params, dtrain, valid_sets=dtest, verbose_eval=False)
        y_predictions = booster_gbm.predict(X_test[:, :-2])
        final_mse = mean_squared_error(y_test, y_predictions)
        final_rmse = np.sqrt(final_mse)
        rmse.append(final_rmse)
    return np.mean(rmse)

study = optuna.create_study(sampler=TPESampler(seed=42), direction='minimize')
study.optimize(lgb_optuna, n_trials=100)
Upvotes: 8
Views: 4850
Reputation: 1
As mentioned, I would also suggest experimenting with different samplers and their hyperparameters, e.g. TPESampler(seed=i, multivariate=True). These hyperparameters sometimes improve the optimization, both time-wise and in the final outcome. Try independent and relative sampling as well. In addition, try modifying the search space and optimizing in search blocks. Design your optimization experiment so that you can gradually narrow down the search space around what should be the minimum (in case RMSE is your evaluation metric). In a large search space the algorithm may not be able to escape local minima, and this may be one reason why it does not suggest new values; modifying the learning rate may also help to escape local minima. Hope I helped, good luck.
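For illustration, a multivariate TPE setup might look like the sketch below; the toy objective is only a placeholder, not the questioner's model:

import optuna
from optuna.samplers import TPESampler

def objective(trial):
    # placeholder objective; substitute the real model/CV loop here
    min_data_in_leaf = trial.suggest_int("min_data_in_leaf", 1, 100)
    return (min_data_in_leaf - 50) ** 2

# multivariate=True samples the parameters jointly instead of independently,
# which can change which combinations get repeated
sampler = TPESampler(seed=42, multivariate=True)
study = optuna.create_study(sampler=sampler, direction="minimize")
study.optimize(objective, n_trials=100)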
Upvotes: 0
Reputation: 155
I have my objective function check study.trials_dataframe() to see whether these parameters have been run before, and if they have, it simply returns the value recorded for the earlier trial instead of re-running the evaluation.
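A minimal sketch of that idea, assuming the study is reachable via trial.study and that column names follow the trials_dataframe() convention (params_<name>); expensive_cv_score here is just a stand-in for the real cross-validation from the question:

def expensive_cv_score(min_data_in_leaf):
    # placeholder for the real cross-validated RMSE computation
    return float(abs(min_data_in_leaf - 50))

def objective(trial):
    min_data_in_leaf = trial.suggest_int("min_data_in_leaf", 1, 100)

    # Look up completed trials; if this value was already evaluated,
    # return the cached result instead of re-training the model.
    df = trial.study.trials_dataframe()
    if len(df):
        done = df[(df["state"] == "COMPLETE") &
                  (df["params_min_data_in_leaf"] == min_data_in_leaf)]
        if len(done):
            return float(done["value"].iloc[0])

    return expensive_cv_score(min_data_in_leaf)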
Upvotes: 2
Reputation: 4629
The problem is your sampler specified in this line:
study = optuna.create_study(sampler=TPESampler(seed=42), direction='minimize')
TPESampler is not a uniform sampler. It is a sampler that tries to draw values from a promising range. See details here and here. That is why you are seeing a lot of duplicates: for the optimizer they are promising values, and they are explored further, perhaps in different combinations.
To get truly uniform sampling, you should change your sampler to:
sampler=RandomSampler(seed)
This will not guarantee that there are no duplicates, but the values will be more evenly distributed.
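For example (a minimal sketch, reusing the lgb_optuna objective from the question):

import optuna
from optuna.samplers import RandomSampler

study = optuna.create_study(sampler=RandomSampler(seed=42), direction='minimize')
study.optimize(lgb_optuna, n_trials=100)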
If you want to ensure that only distinct combinations are searched, you should use GridSampler. As stated in the docs:
the trials suggest all combinations of parameters in the given search space during the study.
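A minimal sketch of how GridSampler could be wired up for the question's single parameter (the grid itself is only an illustration):

import optuna
from optuna.samplers import GridSampler

# every value the sampler is allowed to try; each one is visited at most once
search_space = {"min_data_in_leaf": list(range(1, 101))}

study = optuna.create_study(sampler=GridSampler(search_space), direction='minimize')
# with 100 grid points, 100 trials cover the grid without any duplicates
study.optimize(lgb_optuna, n_trials=100)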
Upvotes: 7