Reputation: 446
I have a situation where input is an image and a group of (3) numeric fields and output is an image mask. I am not sure about how to do that in KERAS...
My architecture is somewhat like the attachment. I am aware about the CNN and Dense architectures, just not sure how to pass the inputs in the corresponding networks and do the concat operation. Also, suggestion of berrer architecture for this will be great!!!!!
Please suggest me, preferably with example code.
Thanks in Advance, Utpal.
Upvotes: 3
Views: 1103
Reputation: 4408
I can advice to try U-net model for this problem. Usual U-net represents several conv and maxpooling layers, and then several conv and upsampling layers:
In the current problem you can mix up non-spatial data (image annotation) at the middle:
Also maybe it's a good idea to start with pre-trained VGG-16 (see below vgg.load_weights(VGG_Weights_path)
).
See code below (based on Divam Gupta's repo):
from keras.models import *
from keras.layers import *
def VGGUnet(n_classes, input_height=416, input_width=608, data_length=128, vgg_level=3):
assert input_height % 32 == 0
assert input_width % 32 == 0
# https://github.com/fchollet/deep-learning-models/releases/download/v0.1/vgg16_weights_th_dim_ordering_th_kernels.h5
img_input = Input(shape=(3, input_height, input_width))
data_input = Input(shape=(data_length,))
x = Conv2D(64, (3, 3), activation='relu', padding='same', name='block1_conv1', data_format=IMAGE_ORDERING)(img_input)
x = Conv2D(64, (3, 3), activation='relu', padding='same', name='block1_conv2', data_format=IMAGE_ORDERING)(x)
x = MaxPooling2D((2, 2), strides=(2, 2), name='block1_pool', data_format=IMAGE_ORDERING)(x)
f1 = x
# Block 2
x = Conv2D(128, (3, 3), activation='relu', padding='same', name='block2_conv1', data_format=IMAGE_ORDERING)(x)
x = Conv2D(128, (3, 3), activation='relu', padding='same', name='block2_conv2', data_format=IMAGE_ORDERING)(x)
x = MaxPooling2D((2, 2), strides=(2, 2), name='block2_pool', data_format=IMAGE_ORDERING)(x)
f2 = x
# Block 3
x = Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv1', data_format=IMAGE_ORDERING)(x)
x = Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv2', data_format=IMAGE_ORDERING)(x)
x = Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv3', data_format=IMAGE_ORDERING)(x)
x = MaxPooling2D((2, 2), strides=(2, 2), name='block3_pool', data_format=IMAGE_ORDERING)(x)
f3 = x
# Block 4
x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv1', data_format=IMAGE_ORDERING)(x)
x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv2', data_format=IMAGE_ORDERING)(x)
x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv3', data_format=IMAGE_ORDERING)(x)
x = MaxPooling2D((2, 2), strides=(2, 2), name='block4_pool', data_format=IMAGE_ORDERING)(x)
f4 = x
# Block 5
x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv1', data_format=IMAGE_ORDERING)(x)
x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv2', data_format=IMAGE_ORDERING)(x)
x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv3', data_format=IMAGE_ORDERING)(x)
x = MaxPooling2D((2, 2), strides=(2, 2), name='block5_pool', data_format=IMAGE_ORDERING)(x)
f5 = x
x = Flatten(name='flatten')(x)
x = Dense(4096, activation='relu', name='fc1')(x)
x = Dense(4096, activation='relu', name='fc2')(x)
x = Dense(1000, activation='softmax', name='predictions')(x)
vgg = Model(img_input, x)
vgg.load_weights(VGG_Weights_path)
levels = [f1, f2, f3, f4, f5]
# Several dense layers for image annotation processing
data_layer = Dense(1024, activation='relu', name='data1')(data_input)
data_layer = Dense(input_height * input_width / 256, activation='relu', name='data2')(data_layer)
data_layer = Reshape((1, input_height / 16, input_width / 16))(data_layer)
# Mix image annotations here
o = (concatenate([f4, data_layer], axis=1))
o = (ZeroPadding2D((1, 1), data_format=IMAGE_ORDERING))(o)
o = (Conv2D(512, (3, 3), padding='valid', data_format=IMAGE_ORDERING))(o)
o = (BatchNormalization())(o)
o = (UpSampling2D((2, 2), data_format=IMAGE_ORDERING))(o)
o = (concatenate([o, f3], axis=1))
o = (ZeroPadding2D((1, 1), data_format=IMAGE_ORDERING))(o)
o = (Conv2D(256, (3, 3), padding='valid', data_format=IMAGE_ORDERING))(o)
o = (BatchNormalization())(o)
o = (UpSampling2D((2, 2), data_format=IMAGE_ORDERING))(o)
o = (concatenate([o, f2], axis=1))
o = (ZeroPadding2D((1, 1), data_format=IMAGE_ORDERING))(o)
o = (Conv2D(128, (3, 3), padding='valid', data_format=IMAGE_ORDERING))(o)
o = (BatchNormalization())(o)
o = (UpSampling2D((2, 2), data_format=IMAGE_ORDERING))(o)
o = (concatenate([o, f1], axis=1))
o = (ZeroPadding2D((1, 1), data_format=IMAGE_ORDERING))(o)
o = (Conv2D(64, (3, 3), padding='valid', data_format=IMAGE_ORDERING))(o)
o = (BatchNormalization())(o)
o = Conv2D(n_classes, (3, 3), padding='same', data_format=IMAGE_ORDERING)(o)
o_shape = Model(img_input, o).output_shape
output_height = o_shape[2]
output_width = o_shape[3]
o = (Reshape((n_classes, output_height * output_width)))(o)
o = (Permute((2, 1)))(o)
o = (Activation('softmax'))(o)
model = Model([img_input, data_input], o)
model.outputWidth = output_width
model.outputHeight = output_height
return model
To train and evaluate a keras model with several inputs prepare separate arrays for each of the input layers - image_train
and annotation_train
(preserving an order by the first axis, i.e. number of the sample) and call this:
model.fit([image_train, annotation_train], result_segmentation_train, batch_size=..., epochs=...)
test_loss, test_acc = model.evaluate([image_test, annotation_test], result_segmentation_test)
Good luck!
Upvotes: 2