Jiaji Huang

Reputation: 331

model ensemble with shared layers

In Keras, I want to train an ensemble of models that share some layers. They are of the following form:

x ---> f(x) ---> g_1(f(x))

x ---> f(x) ---> g_2(f(x))

...

x ---> f(x) ---> g_n(f(x))

Here f(x) is computed by some nontrivial shared layers, while g_1 through g_n each have their own parameters.

At each training step, a data point x is fed into one of the n networks, say the i-th, and a loss on g_i(f(x)) is minimized via a gradient-based optimizer. How can I define and train such a model?

Thanks in advance!

Upvotes: 5

Views: 4313

Answers (1)

indraforyou

Reputation: 9099

You can easily do so by using the functional Model API.

A small example you can build on:

import numpy as np
from keras.models import Model
from keras.layers import Dense, Input

# Dummy data (np.empty is uninitialized; substitute your real inputs/targets)
X = np.empty(shape=(1000, 100))
Y1 = np.empty(shape=(1000,))
Y2 = np.empty(shape=(1000, 2))
Y3 = np.empty(shape=(1000, 3))

inp = Input(shape=(100,))

# Shared layers f(x)
dense_f1 = Dense(50)
dense_f2 = Dense(20)
f = dense_f2(dense_f1(inp))

# Head-specific layers g_1, g_2, g_3
dense_g1 = Dense(1)
g1 = dense_g1(f)

dense_g2 = Dense(2, activation='sigmoid')   # sigmoid to match binary_crossentropy
g2 = dense_g2(f)

dense_g3 = Dense(3, activation='softmax')   # softmax to match categorical_crossentropy
g3 = dense_g3(f)

# A single model with three outputs, trained jointly on all heads
model = Model([inp], [g1, g2, g3])
model.compile(loss=['mse', 'binary_crossentropy', 'categorical_crossentropy'],
              optimizer='rmsprop')

model.summary()

model.fit([X], [Y1, Y2, Y3], epochs=10)
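As a side note, if the heads should not contribute equally to the joint objective, compile also accepts a loss_weights argument. A minimal sketch, assuming the same model as above (the specific weights here are illustrative, not part of the original answer):

model.compile(loss=['mse', 'binary_crossentropy', 'categorical_crossentropy'],
              loss_weights=[1.0, 0.5, 0.5],   # scale each head's loss in the total
              optimizer='rmsprop')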

Edit:

Based on your comments, you can always build separate models and write the training loop yourself, depending on how you want training to proceed. You can see from the model.summary() output that all the models share the initial layers. Here is an extension of the example:

# Three separate models built from the same graph nodes, so the initial
# layers (dense_f1, dense_f2) are shared across all of them
model1 = Model(inp, g1)
model1.compile(loss='mse', optimizer='rmsprop')
model2 = Model(inp, g2)
model2.compile(loss='binary_crossentropy', optimizer='rmsprop')
model3 = Model(inp, g3)
model3.compile(loss='categorical_crossentropy', optimizer='rmsprop')
model1.summary()
model2.summary()
model3.summary()

batch_size = 10
epochs = 10
n_batches = X.shape[0] // batch_size  # integer division so range() gets an int


for iepoch in range(epochs):
    for ibatch in range(n_batches):
        x_batch = X[ibatch*batch_size:(ibatch+1)*batch_size]
        # Route each batch to one head; the shared layers are updated on
        # every batch regardless of which head is trained
        if ibatch % 3 == 0:
            y_batch = Y1[ibatch*batch_size:(ibatch+1)*batch_size]
            model1.train_on_batch(x_batch, y_batch)
        elif ibatch % 3 == 1:
            y_batch = Y2[ibatch*batch_size:(ibatch+1)*batch_size]
            model2.train_on_batch(x_batch, y_batch)
        else:
            y_batch = Y3[ibatch*batch_size:(ibatch+1)*batch_size]
            model3.train_on_batch(x_batch, y_batch)
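To convince yourself the layers really are shared, you can compare a shared layer's weights before and after a step on any single head. A minimal sketch (this check is an addition, not part of the original answer):

w_before = dense_f1.get_weights()[0].copy()
model2.train_on_batch(X[:batch_size], Y2[:batch_size])
w_after = dense_f1.get_weights()[0]
# Nonzero: training model2 moved the weights that model1 and model3 also use
print(np.abs(w_after - w_before).max())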

Upvotes: 10
