Utpal Datta

Reputation: 446

Neural Nets for Image + Numeric Data

I have a situation where the input is an image plus a group of three numeric fields, and the output is an image mask. I am not sure how to do that in Keras.

My architecture is roughly as shown in the attached diagram. I am familiar with CNN and Dense architectures; I am just not sure how to pass the inputs to the corresponding networks and how to do the concat operation. Suggestions for a better architecture would also be great!

Please advise, preferably with example code. Thanks in advance, Utpal.

[Image: proposed architecture diagram]

Upvotes: 3

Views: 1103

Answers (1)

stop-cran

Reputation: 4408

I would suggest trying a U-net model for this problem. A typical U-net consists of several convolution and max-pooling layers, followed by several convolution and upsampling layers:

[Image: U-net architecture diagram]

For the current problem, you can mix in the non-spatial data (the image annotation) in the middle of the network:

[Image: U-net diagram with the annotation data merged in the middle]
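As a much smaller illustration of the same idea, here is a minimal sketch of a two-input Keras model that merges a numeric vector into the image features in the middle of an encoder/decoder. It is not the VGG-based model: the function name, the layer sizes, the 64x64 input and the channels_last layout are all assumptions chosen for brevity.

```python
from keras.models import Model
from keras.layers import (Input, Conv2D, MaxPooling2D, UpSampling2D,
                          Dense, Reshape, concatenate)


def build_small_two_input_net(height=64, width=64, n_numeric=3):
    # Hypothetical toy model; names and sizes are illustrative only.
    img_input = Input(shape=(height, width, 3))   # channels_last image
    num_input = Input(shape=(n_numeric,))         # the numeric fields

    # Encoder: two conv/pool stages, ending at 1/4 of the input resolution
    x = Conv2D(16, (3, 3), activation='relu', padding='same')(img_input)
    x = MaxPooling2D((2, 2))(x)
    x = Conv2D(32, (3, 3), activation='relu', padding='same')(x)
    x = MaxPooling2D((2, 2))(x)

    # Numeric branch: expand the vector into one extra feature map
    # with the same spatial size as the encoder output
    h4, w4 = height // 4, width // 4
    d = Dense(64, activation='relu')(num_input)
    d = Dense(h4 * w4, activation='relu')(d)
    d = Reshape((h4, w4, 1))(d)

    # Merge along the channel axis (the last axis for channels_last)
    x = concatenate([x, d], axis=-1)

    # Decoder: upsample back to the input resolution
    x = UpSampling2D((2, 2))(x)
    x = Conv2D(16, (3, 3), activation='relu', padding='same')(x)
    x = UpSampling2D((2, 2))(x)
    mask = Conv2D(1, (1, 1), activation='sigmoid')(x)  # one-channel mask

    return Model([img_input, num_input], mask)


model = build_small_two_input_net()
```

The key points are that the model takes a list of two Input tensors and that the numeric branch is reshaped to the spatial size of the feature map it is concatenated with.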

It may also be a good idea to start with a pre-trained VGG-16 (see vgg.load_weights(VGG_Weights_path) below).

See code below (based on Divam Gupta's repo):

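from keras.models import *
from keras.layers import *

# Assumed settings, not defined in the original snippet:
# channels-first layout to match the Theano-ordered VGG-16 weights linked below,
# and a placeholder path to a local copy of that weights file.
IMAGE_ORDERING = 'channels_first'
VGG_Weights_path = 'vgg16_weights_th_dim_ordering_th_kernels.h5'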


def VGGUnet(n_classes, input_height=416, input_width=608, data_length=128, vgg_level=3):
    assert input_height % 32 == 0
    assert input_width % 32 == 0

    # https://github.com/fchollet/deep-learning-models/releases/download/v0.1/vgg16_weights_th_dim_ordering_th_kernels.h5
    img_input = Input(shape=(3, input_height, input_width))
    data_input = Input(shape=(data_length,))

    x = Conv2D(64, (3, 3), activation='relu', padding='same', name='block1_conv1', data_format=IMAGE_ORDERING)(img_input)
    x = Conv2D(64, (3, 3), activation='relu', padding='same', name='block1_conv2', data_format=IMAGE_ORDERING)(x)
    x = MaxPooling2D((2, 2), strides=(2, 2), name='block1_pool', data_format=IMAGE_ORDERING)(x)
    f1 = x
    # Block 2
    x = Conv2D(128, (3, 3), activation='relu', padding='same', name='block2_conv1', data_format=IMAGE_ORDERING)(x)
    x = Conv2D(128, (3, 3), activation='relu', padding='same', name='block2_conv2', data_format=IMAGE_ORDERING)(x)
    x = MaxPooling2D((2, 2), strides=(2, 2), name='block2_pool', data_format=IMAGE_ORDERING)(x)
    f2 = x

    # Block 3
    x = Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv1', data_format=IMAGE_ORDERING)(x)
    x = Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv2', data_format=IMAGE_ORDERING)(x)
    x = Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv3', data_format=IMAGE_ORDERING)(x)
    x = MaxPooling2D((2, 2), strides=(2, 2), name='block3_pool', data_format=IMAGE_ORDERING)(x)
    f3 = x

    # Block 4
    x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv1', data_format=IMAGE_ORDERING)(x)
    x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv2', data_format=IMAGE_ORDERING)(x)
    x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv3', data_format=IMAGE_ORDERING)(x)
    x = MaxPooling2D((2, 2), strides=(2, 2), name='block4_pool', data_format=IMAGE_ORDERING)(x)
    f4 = x

    # Block 5
    x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv1', data_format=IMAGE_ORDERING)(x)
    x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv2', data_format=IMAGE_ORDERING)(x)
    x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv3', data_format=IMAGE_ORDERING)(x)
    x = MaxPooling2D((2, 2), strides=(2, 2), name='block5_pool', data_format=IMAGE_ORDERING)(x)
    f5 = x

    x = Flatten(name='flatten')(x)
    x = Dense(4096, activation='relu', name='fc1')(x)
    x = Dense(4096, activation='relu', name='fc2')(x)
    x = Dense(1000, activation='softmax', name='predictions')(x)

    vgg = Model(img_input, x)
    vgg.load_weights(VGG_Weights_path)

    levels = [f1, f2, f3, f4, f5]

    # Several dense layers for image annotation processing
    data_layer = Dense(1024, activation='relu', name='data1')(data_input)
    # Integer division: Dense units and Reshape dimensions must be ints in Python 3
    data_layer = Dense(input_height * input_width // 256, activation='relu', name='data2')(data_layer)
    data_layer = Reshape((1, input_height // 16, input_width // 16))(data_layer)

    # Mix image annotations here
    o = (concatenate([f4, data_layer], axis=1))

    o = (ZeroPadding2D((1, 1), data_format=IMAGE_ORDERING))(o)
    o = (Conv2D(512, (3, 3), padding='valid', data_format=IMAGE_ORDERING))(o)
    o = (BatchNormalization())(o)

    o = (UpSampling2D((2, 2), data_format=IMAGE_ORDERING))(o)
    o = (concatenate([o, f3], axis=1))
    o = (ZeroPadding2D((1, 1), data_format=IMAGE_ORDERING))(o)
    o = (Conv2D(256, (3, 3), padding='valid', data_format=IMAGE_ORDERING))(o)
    o = (BatchNormalization())(o)

    o = (UpSampling2D((2, 2), data_format=IMAGE_ORDERING))(o)
    o = (concatenate([o, f2], axis=1))
    o = (ZeroPadding2D((1, 1), data_format=IMAGE_ORDERING))(o)
    o = (Conv2D(128, (3, 3), padding='valid', data_format=IMAGE_ORDERING))(o)
    o = (BatchNormalization())(o)

    o = (UpSampling2D((2, 2), data_format=IMAGE_ORDERING))(o)
    o = (concatenate([o, f1], axis=1))
    o = (ZeroPadding2D((1, 1), data_format=IMAGE_ORDERING))(o)
    o = (Conv2D(64, (3, 3), padding='valid', data_format=IMAGE_ORDERING))(o)
    o = (BatchNormalization())(o)

    o = Conv2D(n_classes, (3, 3), padding='same', data_format=IMAGE_ORDERING)(o)
    # o depends on both inputs, so both must be passed to Model here
    o_shape = Model([img_input, data_input], o).output_shape
    output_height = o_shape[2]
    output_width = o_shape[3]

    o = (Reshape((n_classes, output_height * output_width)))(o)
    o = (Permute((2, 1)))(o)
    o = (Activation('softmax'))(o)
    model = Model([img_input, data_input], o)
    model.outputWidth = output_width
    model.outputHeight = output_height

    return model
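The sizes used in the data branch and the final output follow from the pooling arithmetic. The short sketch below only reproduces that arithmetic for the default 416x608 input; the helper name is hypothetical and not part of the code above.

```python
def scaled_size(input_height=416, input_width=608, n_pools=4):
    """Spatial size after n_pools 2x2 max-pooling stages."""
    factor = 2 ** n_pools
    assert input_height % factor == 0 and input_width % factor == 0
    return input_height // factor, input_width // factor


# f4 is taken after four pooling stages, so the annotation branch
# must be reshaped to 1/16 of the input size.
h, w = scaled_size(n_pools=4)
units = h * w
print(h, w, units)          # 26 38 988  (988 == 416 * 608 // 256)

# The decoder upsamples three times, so the output is at 1/2 resolution.
out_h, out_w = scaled_size(n_pools=1)
print(out_h, out_w, out_h * out_w)   # 208 304 63232
```

With these defaults the model output has shape (208 * 304, n_classes) per sample, so the training targets need to be prepared at half the input resolution.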

To train and evaluate a Keras model with several inputs, prepare a separate array for each input layer, here image_train and annotation_train, keeping the samples in the same order along the first axis, and call:

model.fit([image_train, annotation_train], result_segmentation_train, batch_size=..., epochs=...)

test_loss, test_acc = model.evaluate([image_test, annotation_test], result_segmentation_test)
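Because the model ends with a Reshape, Permute and softmax, its output per sample has shape (output_height * output_width, n_classes), so the target masks have to be flattened and one-hot encoded to match. A minimal NumPy sketch of that step follows; the helper name and the random dummy masks are assumptions for illustration only.

```python
import numpy as np


def masks_to_targets(masks, n_classes):
    """Turn integer masks (N, H, W) into one-hot targets (N, H*W, n_classes)."""
    n, h, w = masks.shape
    flat = masks.reshape(n, h * w)
    # Indexing an identity matrix with class ids yields one-hot rows
    return np.eye(n_classes, dtype=np.float32)[flat]


# Dummy masks at the model's output resolution (208x304 for a 416x608 input)
masks = np.random.randint(0, 2, size=(4, 208, 304))
targets = masks_to_targets(masks, n_classes=2)
```

Note also that the model must be compiled before calling fit, and evaluate only returns an accuracy value alongside the loss if the model was compiled with metrics=['accuracy'].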

Good luck!

Upvotes: 2
