AlwaysNull

Reputation: 348

Issue replicating AutoKeras StructuredDataClassifier

I have a model that I generated using AutoKeras, and I want to replicate it so that I can rebuild it with Keras Tuner for further hyperparameter tuning. But I am running into issues reproducing the model. The model summary of the AutoKeras model is:

Model: "model"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_1 (InputLayer)         [(None, 11)]              0         
_________________________________________________________________
multi_category_encoding (Mul (None, 11)                0         
_________________________________________________________________
normalization (Normalization (None, 11)                23        
_________________________________________________________________
dense (Dense)                (None, 16)                192       
_________________________________________________________________
re_lu (ReLU)                 (None, 16)                0         
_________________________________________________________________
dense_1 (Dense)              (None, 32)                544       
_________________________________________________________________
re_lu_1 (ReLU)               (None, 32)                0         
_________________________________________________________________
dense_2 (Dense)              (None, 3)                 99        
_________________________________________________________________
classification_head_1 (Softm (None, 3)                 0         
=================================================================
Total params: 858
Trainable params: 835
Non-trainable params: 23

Layer config

{'batch_input_shape': (None, 11), 'dtype': 'string', 'sparse': False, 'ragged': False, 'name': 'input_1'}
{'name': 'multi_category_encoding', 'trainable': True, 'dtype': 'float32', 'encoding': ListWrapper(['int', 'int', 'int', 'int', 'int', 'int', 'int', 'int', 'int', 'int', 'int'])}
{'name': 'normalization', 'trainable': True, 'dtype': 'float32', 'axis': (-1,)}
{'name': 'dense', 'trainable': True, 'dtype': 'float32', 'units': 16, 'activation': 'linear', 'use_bias': True, 'kernel_initializer': {'class_name': 'GlorotUniform', 'config': {'seed': None}}, 'bias_initializer': {'class_name': 'Zeros', 'config': {}}, 'kernel_regularizer': None, 'bias_regularizer': None, 'activity_regularizer': None, 'kernel_constraint': None, 'bias_constraint': None}
{'name': 're_lu', 'trainable': True, 'dtype': 'float32', 'max_value': None, 'negative_slope': array(0., dtype=float32), 'threshold': array(0., dtype=float32)}
{'name': 'dense_1', 'trainable': True, 'dtype': 'float32', 'units': 32, 'activation': 'linear', 'use_bias': True, 'kernel_initializer': {'class_name': 'GlorotUniform', 'config': {'seed': None}}, 'bias_initializer': {'class_name': 'Zeros', 'config': {}}, 'kernel_regularizer': None, 'bias_regularizer': None, 'activity_regularizer': None, 'kernel_constraint': None, 'bias_constraint': None}
{'name': 're_lu_1', 'trainable': True, 'dtype': 'float32', 'max_value': None, 'negative_slope': array(0., dtype=float32), 'threshold': array(0., dtype=float32)}
{'name': 'dense_2', 'trainable': True, 'dtype': 'float32', 'units': 3, 'activation': 'linear', 'use_bias': True, 'kernel_initializer': {'class_name': 'GlorotUniform', 'config': {'seed': None}}, 'bias_initializer': {'class_name': 'Zeros', 'config': {}}, 'kernel_regularizer': None, 'bias_regularizer': None, 'activity_regularizer': None, 'kernel_constraint': None, 'bias_constraint': None}
{'name': 'classification_head_1', 'trainable': True, 'dtype': 'float32', 'axis': -1}

My training data is a DataFrame, containing both numerical and categorical columns, that has been converted to string type. Since the output layer is a softmax, I used LabelBinarizer to convert the target classes.
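A minimal sketch of that preprocessing (the file name and the "target" column are hypothetical placeholders, not from my actual data):

from tensorflow import keras
import pandas as pd
from sklearn.preprocessing import LabelBinarizer

df = pd.read_csv("data.csv")                       # hypothetical source file
X = df.drop(columns=["target"]).astype(str)        # all 11 features as strings
y = LabelBinarizer().fit_transform(df["target"])   # one-hot targets for the softmax head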

To make sure the model was replicated properly, I used keras.models.clone_model to create a copy of the model and trained it myself. But when I did, the accuracy did not improve even after 500 epochs.
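For reference, a minimal sketch of that cloning attempt (assuming base_model is the exported AutoKeras model and X, y are the prepared features and labels from above):

from tensorflow import keras

# clone_model rebuilds every layer from its config and re-initializes the
# weights, so any state a layer acquired by adapting to the data is lost.
clone = keras.models.clone_model(base_model)
clone.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])
clone.fit(X, y, epochs=500)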

Is there something that I am missing when it comes to training the model from scratch?

Upvotes: 0

Views: 722

Answers (2)

AlwaysNull

Reputation: 348

I was finally able to solve my issue. Oddly, even though the custom multi-category encoding layer has no trainable parameters, it contains its own learned mapping for the input data, and that mapping is not carried over when the model is rebuilt from layer configs. To extend the model and examine the effect of layer depth, I created a new model that reuses the multi-category encoding layer from the existing model. Once I did this, training accuracy matched AutoKeras.

Edit: Adding code below:

from tensorflow import keras

# Reuse the fitted preprocessing layers from the AutoKeras model:
# layers[1] is the multi-category encoding, layers[2] is the normalization.
inputs = keras.layers.Input(shape=(11,), dtype='string')
x = base_model.layers[1](inputs)
x = base_model.layers[2](x)

# New dense stack with different widths, to examine the effect of depth.
x = keras.layers.Dense(176)(x)
x = keras.layers.ReLU()(x)
x = keras.layers.Dense(400)(x)
x = keras.layers.ReLU()(x)
x = keras.layers.Dense(464)(x)
x = keras.layers.ReLU()(x)
x = keras.layers.Dense(3)(x)
x = keras.layers.Softmax()(x)

layer_3 = keras.Model(inputs, x)

where index 1 picks out the multi-category encoding layer (and index 2 the normalization layer) of the original model.
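A hypothetical compile/fit step for the rebuilt model; the optimizer and loss are standard choices for a softmax classifier, not values taken from AutoKeras:

# X, y are the string-typed features and binarized labels from the question.
layer_3.compile(optimizer="adam",
                loss="categorical_crossentropy",
                metrics=["accuracy"])
layer_3.fit(X, y, epochs=100, validation_split=0.2)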

Upvotes: 0

neel g

Reputation: 1255

AutoKeras doesn't support any direct conversion - its internals are too tightly coupled to the package to be isolated from it. The suggestion that a softmax activation is missing is wrong, as one is indeed present:

classification_head_1 (Softm --> the layer name is just truncated in the summary printout; it is a Softmax layer

Next up - do you notice the lack of parameters? 858 is a pretty small total, and most layers report 0 parameters. That is because AutoKeras uses custom layers that make up its custom blocks (more about the blocks in their docs). You can confirm this yourself, as in the sketch below.
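A quick sketch to see where the parameters live; model here is assumed to be the pipeline exported from AutoKeras via clf.export_model():

# Print the per-layer parameter counts; the preprocessing layers report 0
# even though they hold learned state (e.g. category mappings).
for layer in model.layers:
    print(f"{layer.name:30s} {layer.count_params():6d} params")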

To re-create those custom layers, you would need their exact code, which can't be isolated as of the time of writing (though @haifeng-jin is discussing it): specific packages are used to process the input data, power the NAS (Neural Architecture Search), and run the optimization routines.

Unless you study their code and re-implement the custom layers yourself (which is some work, though not too much since the code is openly available), using keras.models.clone_model is a futile attempt: it only works with pre-defined Keras layers, so it leads to broken models like the one you currently have.

More importantly, AutoKeras does hyperparameter tuning on its own - if you want to tune your model further, just run the AutoKeras search for a longer period of time to get better results.
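A sketch of simply searching longer; the max_trials and epochs values here are illustrative, not recommendations:

import autokeras as ak

# More trials = a longer architecture/hyperparameter search.
clf = ak.StructuredDataClassifier(max_trials=100, overwrite=True)
clf.fit(x_train, y_train, epochs=200)   # x_train/y_train: your raw data
model = clf.export_model()              # best model found by the search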

tl;dr: you can't directly clone custom layers and blocks that have in-package dependencies. If you want to do hyperparameter tuning, run the AutoKeras search for much longer to get a better model.

Upvotes: 1
