Max Power

Reputation: 992

How to bypass portion of neural network in TensorFlow for some (but not all) features

In my TensorFlow model I have some data that I feed into a stack of CNNs before it goes into a few fully connected layers. I have implemented that with Keras' Sequential model. However, I now have some data that should not go into the CNNs and should instead be fed directly into the first fully connected layer: it consists of values and labels that are part of the input data but, not being image data, should not undergo convolutions.

Is such a thing possible with tensorflow.keras, or should I do that with tensorflow.nn instead? As far as I understand Keras' Sequential models, the input goes in one end and comes out the other, with no special wiring in the middle.

Am I correct that, to do this, I have to use tensorflow.concat to join the output of the last CNN layer with the data that bypasses the CNNs before feeding the result into the first fully connected layer?
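
Schematically, this is what I am after (pseudocode; cnn_stack and dense_stack are just placeholders for my existing layers):

cnn_out = cnn_stack(image_input)         # the existing Sequential CNN
merged = concat([cnn_out, extra_input])  # the extra data skips the CNNs
output = dense_stack(merged)             # the existing fully connected layers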

Upvotes: 1

Views: 971

Answers (2)

Daniel Möller

Reputation: 86600

With a little remodeling and the functional API, you can:

from tensorflow.keras.layers import Input, Conv2D, MaxPooling2D, Dense
from tensorflow.keras.models import Model

#create the CNN - it can also be a sequential
cnn_input = Input(image_shape)
cnn_output = Conv2D(...)(cnn_input)
cnn_output = Conv2D(...)(cnn_output)
cnn_output = MaxPooling2D()(cnn_output)
....

cnn_model = Model(cnn_input, cnn_output)

#create the FC model - can also be a sequential
fc_input = Input(fc_input_shape)
fc_output = Dense(...)(fc_input)
fc_output = Dense(...)(fc_output)

fc_model = Model(fc_input, fc_output)

There is a lot of room for creativity; this is just one of the ways to do it.

#create the full model
full_input = Input(image_shape)
full_output = cnn_model(full_input)
full_output = fc_model(full_output)

full_model = Model(full_input, full_output)
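
And to wire in the bypass the question asks about, the same functional style extends to a second input that skips the CNN entirely and is concatenated with the CNN features before the fully connected model. A minimal sketch, where extra_shape, bypass_model and the Flatten are assumptions about your data:

from tensorflow.keras.layers import Flatten, Concatenate

#hypothetical sketch: a second input skips the CNN and is joined
#with the CNN features before the fully connected model
image_input = Input(image_shape)
extra_input = Input(extra_shape)   #extra_shape is an assumption

features = cnn_model(image_input)
features = Flatten()(features)     #only needed if cnn_output is not already flat
merged = Concatenate()([features, extra_input])

#fc_input_shape must equal the flattened CNN size plus the extra size
bypass_output = fc_model(merged)
bypass_model = Model([image_input, extra_input], bypass_output)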

You can use any of these models in any way you want. They share the same layers and weights, so internally they are the same.
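
For example (some_images is a hypothetical placeholder for a batch of images):

features = cnn_model.predict(some_images)      #CNN features only
predictions = full_model.predict(some_images)  #the full pipeline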

Saving and loading the full model might be quirky. I'd probably save the other two separately and, when loading, create the full model again.

Note also that if you save two models that share layers, they will probably no longer share those layers after loading. (Another reason to save/load only fc_model and cnn_model and recreate full_model from code.)
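
A rough sketch of that save/load strategy (the file names are arbitrary):

#save the two building blocks separately
cnn_model.save('cnn.h5')
fc_model.save('fc.h5')

#later: load them back and rebuild the full model from code
from tensorflow.keras.models import load_model
cnn_model = load_model('cnn.h5')
fc_model = load_model('fc.h5')
full_input = Input(image_shape)
full_model = Model(full_input, fc_model(cnn_model(full_input)))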

Upvotes: 0

meTchaikovsky

Reputation: 7666

Here is a simple example in which the combining operation is to sum the activations from different subnets:

import keras
import numpy as np
import tensorflow as tf
from keras.layers import Input, Dense

tf.reset_default_graph()  # TF 1.x graph reset; drop this line under TF 2.x

# this dense subnet stands in for your CNN model
def nn_model(input_x):
    feature_maker = Dense(10, activation='relu')(input_x)
    feature_maker = Dense(20, activation='relu')(feature_maker)
    feature_maker = Dense(1, activation='linear')(feature_maker)
    return feature_maker

# a list of input layers, of course the input shapes can be different
input_layers = [Input(shape=(3, )) for _ in range(2)]
coupled_feature = [nn_model(input_x) for input_x in input_layers]

# assume you take the sum of the outputs 
coupled_feature = keras.layers.Add()(coupled_feature)
prediction = Dense(1, activation='relu')(coupled_feature)

model = keras.models.Model(inputs=input_layers, outputs=prediction)
model.compile(loss='mse', optimizer='adam')

# example training set
x_1 = np.linspace(1, 90, 270).reshape(90, 3)
x_2 = np.linspace(1, 90, 270).reshape(90, 3)
y = np.random.rand(90)

inputs_x = [x_1, x_2]

model.fit(inputs_x, y, batch_size=32, epochs=10)
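
If you want the concatenation the question mentions instead of the sum, only the merge layer changes (replacing the Add line above):

# variant: concatenate the subnet outputs instead of adding them
coupled_feature = keras.layers.Concatenate()(coupled_feature)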

You can also plot the model to gain more intuition:

from keras.utils.vis_utils import plot_model

plot_model(model, show_shapes=True)

The model from the code above looks like this:

[plot_model output: two Input branches, each passing through three Dense layers, merged by an Add layer, followed by the final Dense layer]

Upvotes: 1
