Jiaji Huang

Reputation: 331

model ensemble with shared layers

In Keras, I want to train an ensemble of models that share some layers. They are of the following form:

x ---> f(x) ---> g_1(f(x))

x ---> f(x) ---> g_2(f(x))

...

x ---> f(x) ---> g_n(f(x))

Here f(x) is computed by some nontrivial shared layers, while g_1 through g_n each have their own parameters.

At each training step, a data point x is fed into one of the n networks, say the i-th, and a loss on g_i(f(x)) is minimized via a gradient-based optimizer. How can I define and train such a model?

Thanks in advance!

Upvotes: 5

Views: 4313

Answers (1)

indraforyou

Reputation: 9099

You can easily do so by using the functional Model API.

A small example you can build on:

import numpy as np
from keras.models import Model
from keras.layers import Dense, Input

# Dummy data (np.empty is uninitialized; substitute your real inputs/targets)
X = np.empty(shape=(1000, 100))
Y1 = np.empty(shape=(1000,))
Y2 = np.empty(shape=(1000, 2))
Y3 = np.empty(shape=(1000, 3))

inp = Input(shape=(100,))

# Shared layers f(x)
dense_f1 = Dense(50)
dense_f2 = Dense(20)
f = dense_f2(dense_f1(inp))

# Head-specific layers g_1, g_2, g_3
dense_g1 = Dense(1)
g1 = dense_g1(f)

dense_g2 = Dense(2, activation='sigmoid')   # sigmoid to match binary_crossentropy
g2 = dense_g2(f)

dense_g3 = Dense(3, activation='softmax')   # softmax to match categorical_crossentropy
g3 = dense_g3(f)

# A single model with three outputs, trained jointly on all heads
model = Model([inp], [g1, g2, g3])
model.compile(loss=['mse', 'binary_crossentropy', 'categorical_crossentropy'],
              optimizer='rmsprop')

model.summary()

model.fit([X], [Y1, Y2, Y3], epochs=10)
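As a side note, if the heads should not contribute equally to the joint objective, compile also accepts a loss_weights argument. A minimal sketch, assuming the same model as above (the specific weights here are illustrative, not part of the original answer):

model.compile(loss=['mse', 'binary_crossentropy', 'categorical_crossentropy'],
              loss_weights=[1.0, 0.5, 0.5],   # scale each head's loss in the total
              optimizer='rmsprop')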

Edit:

Based on your comments, you can always build separate models and write the training loop yourself, depending on how you want training to proceed. You can see from the model.summary() output that all the models share the initial layers. Here is an extension of the example:

# Three separate models built from the same graph nodes, so the initial
# layers (dense_f1, dense_f2) are shared across all of them
model1 = Model(inp, g1)
model1.compile(loss='mse', optimizer='rmsprop')
model2 = Model(inp, g2)
model2.compile(loss='binary_crossentropy', optimizer='rmsprop')
model3 = Model(inp, g3)
model3.compile(loss='categorical_crossentropy', optimizer='rmsprop')
model1.summary()
model2.summary()
model3.summary()

batch_size = 10
epochs = 10
n_batches = X.shape[0] // batch_size  # integer division so range() gets an int


for iepoch in range(epochs):
    for ibatch in range(n_batches):
        x_batch = X[ibatch*batch_size:(ibatch+1)*batch_size]
        # Route each batch to one head; the shared layers are updated on
        # every batch regardless of which head is trained
        if ibatch % 3 == 0:
            y_batch = Y1[ibatch*batch_size:(ibatch+1)*batch_size]
            model1.train_on_batch(x_batch, y_batch)
        elif ibatch % 3 == 1:
            y_batch = Y2[ibatch*batch_size:(ibatch+1)*batch_size]
            model2.train_on_batch(x_batch, y_batch)
        else:
            y_batch = Y3[ibatch*batch_size:(ibatch+1)*batch_size]
            model3.train_on_batch(x_batch, y_batch)
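To convince yourself the layers really are shared, you can compare a shared layer's weights before and after a step on any single head. A minimal sketch (this check is an addition, not part of the original answer):

w_before = dense_f1.get_weights()[0].copy()
model2.train_on_batch(X[:batch_size], Y2[:batch_size])
w_after = dense_f1.get_weights()[0]
# Nonzero: training model2 moved the weights that model1 and model3 also use
print(np.abs(w_after - w_before).max())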

Upvotes: 10
