Alessandro

Reputation: 794

Unable to perform Grid Search for models receiving more than one input (Keras)

I have created final_model that receives two inputs (sequences of length 8). Each of them is processed by two different models, model_A and model_B. Then the two outputs are merged returning the input of model_C, which finally returns the output of the whole model.

This is the graphical overview:

[model diagram]

and this is the code:

    model_A = models.Sequential()
    model_A.add(layers.Dense(16, activation='relu', input_shape=(n,)))
    model_A.add(layers.Dense(3))

    model_B = models.Sequential()
    model_B.add(layers.Dense(16, activation='relu', input_shape=(n,)))
    model_B.add(layers.Dense(3))

    concatenated = layers.concatenate([model_A.output, model_B.output])
    model_C = layers.Dense(16, activation='relu')(concatenated)
    out = layers.Dense(3, activation='softmax')(model_C)

    final_model = models.Model([model_A.input, model_B.input], out)

Everything works fine when I fit my model:

    opt = keras.optimizers.Adam(learning_rate=0.001)
    final_model.compile(optimizer=opt, loss='categorical_crossentropy', metrics=['accuracy'])
    history = final_model.fit([X_train,x_train], y_train, epochs=500, batch_size=1000)

However, I cannot use grid search to optimize the hyper-parameters. Indeed, with the following code

    batch_size = [10, 20]
    epochs = [10, 50]
    param_grid = dict(batch_size=batch_size, epochs=epochs)
    grid = GridSearchCV(estimator=final_model, param_grid=param_grid, n_jobs=-1,
                        cv=3, scoring="accuracy")
    grid_result = grid.fit([X_train,x_train], y_train)

I get this error:

Found input variables with inconsistent numbers of samples: [2, 40000]

Note that the shape of both X_train and x_train is [40000,8].
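I suspect this happens because scikit-learn counts samples along the first axis of whatever is passed as X, so the list [X_train, x_train] is seen as 2 samples against 40000 labels. A minimal sketch reproducing the check with sklearn's check_consistent_length:

    import numpy as np
    from sklearn.utils import check_consistent_length

    X_train = np.zeros((40000, 8))
    x_train = np.zeros((40000, 8))
    y_train = np.zeros((40000, 3))

    # A list of two arrays has length 2, so it is treated as 2 samples.
    check_consistent_length([X_train, x_train], y_train)
    # ValueError: Found input variables with inconsistent numbers of samples: [2, 40000]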

Is there a way for using grid search in the case of multiple inputs?

Upvotes: 1

Views: 843

Answers (3)

user2246849

Reputation: 4407

Note that tf.keras.wrappers.scikit_learn.KerasClassifier, mentioned in the answer linked by @Sean, is now deprecated. The current way to do this is with scikeras. Models are wrapped using subclasses of BaseWrapper (more details here).

Here is a toy example to demonstrate how you could do it with your model:

import numpy as np
from tensorflow.keras import models, layers, Input
from tensorflow import keras
from sklearn.base import BaseEstimator
from sklearn.preprocessing import FunctionTransformer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import GridSearchCV
from scikeras.wrappers import KerasClassifier, BaseWrapper


# Split the stacked array back into the two model inputs.
def split_input(X):
    # (40000, 16) -> two arrays of shape (40000, 8).
    half = X.shape[1] // 2
    return [X[:, :half], X[:, half:]]


# Wrapper estimator that will call split_input via the FunctionTransformer.
# This will split the input before feeding them to the model.
# Instead of BaseWrapper, you can also subclass scikeras.wrappers.KerasClassifier (see documentation for differences).
class MultiInputEstimator(BaseWrapper):
    @property
    def feature_encoder(self):
        return FunctionTransformer(func=split_input)


    # Score passed to the grid search.
    @staticmethod
    def scorer(y_true, y_pred, **kwargs):
        return accuracy_score(np.argmax(y_true, axis=1), np.argmax(y_pred, axis=1))


# Should return the fully compiled model; any tunable parameters are passed in here.
def get_model(input_shape, n_dense_1):
    model_A = models.Sequential()
    model_A.add(layers.Dense(n_dense_1, activation='relu', input_shape=input_shape))
    model_A.add(layers.Dense(3))

    model_B = models.Sequential()
    model_B.add(layers.Dense(n_dense_1, activation='relu', input_shape=input_shape))
    model_B.add(layers.Dense(3))

    concatenated = layers.concatenate([model_A.output, model_B.output])
    model_C = layers.Dense(16, activation='relu')(concatenated)
    out = layers.Dense(3, activation='softmax')(model_C)

    final_model = models.Model([model_A.input, model_B.input], out)
    
    opt = keras.optimizers.Adam(learning_rate=0.001)
    final_model.compile(optimizer=opt, loss='categorical_crossentropy', metrics=['accuracy'])

    return final_model


X1 = np.random.random((40000, 8))
X2 = np.random.random((40000, 8))
# Stick the two datasets together. `split_input` will take care of separating them.
X = np.hstack([X1, X2])  # (40000, 16); X.shape[0] must match y.shape[0].

y = np.zeros((40000, 3))
y[0:20000, 0] = 1
y[20000:30000, 1] = 1
y[30000:, 2] = 1

# get_model is called to return the fully compiled model which is wrapped in our MultiInputEstimator instance.
clf = MultiInputEstimator(model=get_model, model__input_shape=(int(X.shape[1]/2),), model__n_dense_1=16)

params = {'model__n_dense_1': [16, 32, 128]}

grid = GridSearchCV(estimator=clf, param_grid=params, cv=5, verbose=True)
grid_res = grid.fit(X=X, y=y)

The only new elements here are the MultiInputEstimator wrapper and the split_input function. The idea is to trick the grid search into seeing a single input dataset by merging the inputs, and then use a scikeras.wrappers.BaseWrapper to run a FunctionTransformer that splits the dataset back apart before feeding it to the model.
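To see the round trip concretely, here is a quick sanity check (a minimal sketch reusing the split_input defined above):

    import numpy as np

    X1 = np.random.random((5, 8))
    X2 = np.random.random((5, 8))
    X = np.hstack([X1, X2])  # (5, 16): what the grid search sees as one input.

    A, B = split_input(X)    # What the model receives after the transformer runs.
    assert A.shape == B.shape == (5, 8)
    assert np.array_equal(A, X1) and np.array_equal(B, X2)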

Accessing the model:

Since the model is now wrapped in clf, we can access it via clf.model_ once it has been built, which happens when we fit it (clf.fit(X, y)) or initialize it (clf.initialize(X, y)). If we pass a Keras Model instance instead of a build function like get_model, the model is available directly. An example of plotting the model:

from tensorflow import keras

clf.initialize(X, y)
keras.utils.plot_model(clf.model_, show_shapes=True)

[model plot]

Upvotes: 3

razimbres

Reputation: 5015

Try this solution using keras.wrappers.scikit_learn and KerasClassifier; you just need to reshape the data:

import numpy as np
from tensorflow import keras
from tensorflow.keras import layers
import tensorflow as tf
models = tf.keras  # alias so that models.Sequential and models.Model match the question's code
num_classes = 10
input_shape = (28, 28, 1)

(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()

x_train = x_train.astype("float32") / 255
x_test = x_test.astype("float32") / 255
x_train = np.expand_dims(x_train, -1).reshape(-1, 28, 28)
x_test = np.expand_dims(x_test, -1).reshape(-1, 28, 28)
X_train=x_train


y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)

def create_model():
    model_A = models.Sequential()
    model_A.add(layers.Dense(16, activation='relu', input_shape=(28, 28, 1)))
    model_A.add(layers.Dense(10))
    model_A.add(layers.Flatten())

    model_B = models.Sequential()
    model_B.add(layers.Dense(16, activation='relu', input_shape=(28, 28, 1)))
    model_B.add(layers.Dense(10))
    model_B.add(layers.Flatten())

    concatenated = layers.concatenate([model_A.output, model_B.output])
    model_C = layers.Dense(16, activation='relu')(concatenated)
    out = layers.Dense(10, activation='softmax')(model_C)

    final_model = models.Model([model_A.input, model_B.input], out)
    final_model.compile(loss="categorical_crossentropy", optimizer=keras.optimizers.Adam(learning_rate=0.001), metrics=["accuracy"])
    return final_model

from keras.wrappers.scikit_learn import KerasClassifier
model = KerasClassifier(build_fn=create_model)

history = model.fit([X_train, x_train], y_train, epochs=500, batch_size=1000)

from sklearn.model_selection import GridSearchCV

batch_size = [300, 600]
epochs = [100, 200]
param_grid = dict(batch_size=batch_size, epochs=epochs)

grid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=-1, cv=3, scoring="accuracy")
grid_result = grid.fit(np.array([X_train, x_train]).reshape(-1, 28, 28, 2), y_train)

Upvotes: 1

Sean

Reputation: 552

Grid Search for Keras with multiple inputs - this might answer your question.

On the other hand, there are other special purpose hyperparameter search libraries such as ray.
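For example, here is a minimal sketch with Ray Tune. This is an illustration under assumptions, not a definitive recipe: it uses the classic tune.run/tune.report style (newer Ray releases have since moved to a Tuner-based API), and it reuses the get_model build function from the scikeras answer above together with the [X_train, x_train] data from the question:

    from ray import tune

    def trainable(config):
        # Build and train the two-input model with the sampled hyper-parameters.
        model = get_model(input_shape=(8,), n_dense_1=config["n_dense_1"])
        history = model.fit([X_train, x_train], y_train,
                            epochs=config["epochs"],
                            batch_size=config["batch_size"],
                            verbose=0)
        # Report the final training accuracy back to Tune.
        tune.report(accuracy=history.history["accuracy"][-1])

    analysis = tune.run(
        trainable,
        config={
            "n_dense_1": tune.grid_search([16, 32]),
            "epochs": tune.grid_search([10, 50]),
            "batch_size": tune.grid_search([10, 20]),
        },
    )
    print(analysis.get_best_config(metric="accuracy", mode="max"))

Unlike GridSearchCV, Tune does not need the inputs merged into one array, since the trainable calls model.fit directly with the list of arrays.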

Upvotes: 1
