Reputation: 468
According to the Keras Tuner examples here and here, if you want to define the number of layers and each layer's units in a deep learning model using hyperparameters, you do something like this:
for i in range(hp.Int('num_layers', 1, 10)):
    model.add(layers.Dense(units=hp.Int('unit_' + str(i), min_value=32, max_value=512, step=32)))
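For reference, here is a minimal runnable sketch of how this pattern sits inside a complete model-building function (the input shape, output layer, optimizer, and loss are illustrative placeholders I've added, not part of the tuner examples):

from tensorflow import keras
from tensorflow.keras import layers

def build_model(hp):
    model = keras.Sequential()
    model.add(layers.Flatten(input_shape=(28, 28)))  # placeholder input shape
    # num_layers and each unit_i are registered with the oracle here
    for i in range(hp.Int('num_layers', 1, 10)):
        model.add(layers.Dense(
            units=hp.Int('unit_' + str(i), min_value=32, max_value=512, step=32),
            activation='relu'))
    model.add(layers.Dense(10, activation='softmax'))  # placeholder output
    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
    return model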
However, as others have noted here and here, after the oracle has seen a model with num_layers = 10, it will always assign a value to unit_0 through unit_9, even when num_layers is less than 10. When num_layers = 1, for example, only unit_0 is used to build the model, but unit_1 through unit_9 are still defined and active in the hyperparameters.
Does the oracle "know" that unit_1 through unit_9 weren't actually used to build the model, and therefore disregard them when judging that trial's results? Or does it assume unit_1 through unit_9 were used simply because they are defined (calling hp.get('unit_9'), for example, will return a value)?
In the latter case, the oracle is driving the tuning process with misinformation. At best it will take longer to converge; at worst it will converge to the wrong solution because it attributes relevance to hyperparameters that were never used.
Should the model actually be defined using conditional scopes, like this?
num_layers = hp.Int('num_layers', 1, 10)
for i in range(num_layers):
    with hp.conditional_scope('num_layers', list(range(i + 1, 10 + 1))):
        model.add(layers.Dense(units=hp.Int('unit_' + str(i), min_value=32, max_value=512, step=32)))
When the model is defined this way and num_layers < 10, calling hp.get('unit_9') raises ValueError: Conditional parameter unit_9 is not currently active, as expected.
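For completeness, the same sketch rewritten with conditional scopes (the surrounding layers and compile settings are the same illustrative placeholders as above):

from tensorflow import keras
from tensorflow.keras import layers

def build_model(hp):
    model = keras.Sequential()
    model.add(layers.Flatten(input_shape=(28, 28)))  # placeholder input shape
    num_layers = hp.Int('num_layers', 1, 10)
    for i in range(num_layers):
        # unit_i is only active for trials where num_layers > i
        with hp.conditional_scope('num_layers', list(range(i + 1, 10 + 1))):
            model.add(layers.Dense(
                units=hp.Int('unit_' + str(i), min_value=32, max_value=512, step=32),
                activation='relu'))
    model.add(layers.Dense(10, activation='softmax'))  # placeholder output
    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
    return model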
Upvotes: 7
Views: 1297
Reputation: 76
Using conditional scopes is the best approach, as it lets the tuner correctly recognize which parameters are active. Without conditional scopes there is, at least at the moment, no way to tell the tuner which parameters are actually used.
However, when using RandomSearch, the simpler approach (which leaves the inactive parameters in place) should give exactly the same result. When starting a new trial, the tuner goes through all the possibilities but rejects the invalid ones before actually starting the trial.
Of the existing tuners, I think only the Bayesian one is strongly affected by this. I am not 100% sure about Hyperband, but for RandomSearch the two approaches are exactly the same (apart from displaying inactive parameters, which confuses people).
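For example, a minimal sketch of driving the search with RandomSearch (build_model is the conditional-scope function from the question; the random data, max_trials, and epochs are placeholders just to make it runnable):

import numpy as np
import keras_tuner as kt

x = np.random.rand(128, 28, 28).astype('float32')  # placeholder data
y = np.random.randint(0, 10, size=(128,))

tuner = kt.RandomSearch(
    build_model,               # the conditional-scope hypermodel above
    objective='val_accuracy',
    max_trials=20,
    overwrite=True)
# Combinations already tried are rejected before a trial actually runs.
tuner.search(x, y, epochs=2, validation_split=0.2)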
Upvotes: 5