RektAngle

Reputation: 102

Multi image input in keras

I tried to implement the PilotNet model using Keras. Using the Sequential model I was able to implement a single-image CNN, but how do we input 3 images into a CNN network in Keras?

def createModel():
    model = Sequential()

    model.add(Convolution2D(24, (5, 5), (2, 2), input_shape=(66, 200, 3), activation='relu'))
    model.add(Convolution2D(36, (5, 5), (2, 2), activation='relu'))
    model.add(Convolution2D(48, (5, 5), (2, 2), activation='relu'))
    model.add(Convolution2D(64, (3, 3), activation='relu'))
    model.add(Convolution2D(64, (3, 3), activation='relu'))
    model.add(Flatten())
    model.add(Dense(100, activation='relu'))
    model.add(Dense(50, activation='relu'))
    model.add(Dense(10, activation='relu'))

    model.add(Dense(1))
    model.compile(Adam(lr=0.0001), loss='mse')
    return model

This implementation was only for the center-camera image, but how do I feed in the left and right camera images as well, so that the model produces a single output, i.e. my steering angle?

[Image: model I'm trying to implement]

Upvotes: 1

Views: 2232

Answers (2)

Suesarn Wilainuch

Reputation: 21

In my opinion, the right approach for your problem is the Keras functional API: it is designed for complex models with multiple inputs or multiple outputs, and your case requires multiple inputs. Besides the model design, you also need to feed the images accordingly, but I will skip that point since I think you already know how.

I have provided a modeling example with the Keras functional API for your problem. I am referring to the picture you attached; you can modify the model structure yourself. From the picture, I understand that the three CNNs do not share weights.

I've shown two methods. The first declares all three CNNs and merges them into a single model at once.

The second exploits the fact that the three CNNs have the same structure: each is built separately and then combined into a new model, which is more convenient.

First method

import tensorflow as tf

def createModel():
  image_shape = (66, 200, 3)

  input_img_center = tf.keras.Input(image_shape)
  input_img_right = tf.keras.Input(image_shape)
  input_img_left = tf.keras.Input(image_shape)

  # CNN for center 
  f1 = tf.keras.layers.Conv2D(24, (5, 5), (2, 2), activation='relu')(input_img_center)
  f1 = tf.keras.layers.Conv2D(36, (5, 5), (2, 2), activation='relu')(f1)
  f1 = tf.keras.layers.Conv2D(48, (5, 5), (2, 2), activation='relu')(f1)
  f1 = tf.keras.layers.Conv2D(64, (3, 3), activation='relu')(f1)
  f1 = tf.keras.layers.Flatten()(f1)

  # CNN for right
  f2 = tf.keras.layers.Conv2D(24, (5, 5), (2, 2), activation='relu')(input_img_right)
  f2 = tf.keras.layers.Conv2D(36, (5, 5), (2, 2), activation='relu')(f2)
  f2 = tf.keras.layers.Conv2D(48, (5, 5), (2, 2), activation='relu')(f2)
  f2 = tf.keras.layers.Conv2D(64, (3, 3), activation='relu')(f2)
  f2 = tf.keras.layers.Flatten()(f2)

  # CNN for left
  f3 = tf.keras.layers.Conv2D(24, (5, 5), (2, 2), activation='relu')(input_img_left)
  f3 = tf.keras.layers.Conv2D(36, (5, 5), (2, 2), activation='relu')(f3)
  f3 = tf.keras.layers.Conv2D(48, (5, 5), (2, 2), activation='relu')(f3)
  f3 = tf.keras.layers.Conv2D(64, (3, 3), activation='relu')(f3)
  f3 = tf.keras.layers.Flatten()(f3)

  # concatenate feature vector from 3 view
  f = tf.keras.layers.concatenate([f1, f2, f3])

  # create whatever layer you want (in this example, I followed by an additional fully connected layer)
  f = tf.keras.layers.Dense(100, activation = 'relu')(f)
  f = tf.keras.layers.Dense(50, activation = 'relu')(f)
  f = tf.keras.layers.Dense(10, activation = 'relu')(f)

  output = tf.keras.layers.Dense(1)(f)

  model = tf.keras.Model([input_img_center, input_img_right, input_img_left], [output])

  opt = tf.keras.optimizers.Adam(learning_rate=0.0001)
  model.compile(optimizer=opt, loss='mse')    

  model.summary()
  return model

model = createModel()
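When training or predicting with a multi-input model like this, Keras expects one array per `Input`, passed as a list in the same order as the inputs were declared. A minimal sketch with random dummy data (the tiny stand-in model and all array names here are illustrative, not part of the answer's code; the convolutional towers are collapsed to one layer each to keep the sketch short):

```python
import numpy as np
import tensorflow as tf

def tiny_multi_input_model():
    # Illustrative stand-in for createModel(): three inputs, one conv
    # layer per view, concatenated into a single regression head.
    shape = (66, 200, 3)
    inputs = [tf.keras.Input(shape) for _ in range(3)]
    feats = [
        tf.keras.layers.Flatten()(
            tf.keras.layers.Conv2D(8, 5, 4, activation='relu')(x))
        for x in inputs
    ]
    f = tf.keras.layers.concatenate(feats)
    out = tf.keras.layers.Dense(1)(f)
    model = tf.keras.Model(inputs, out)
    model.compile(optimizer='adam', loss='mse')
    return model

model = tiny_multi_input_model()

# One array per Input, in the order [center, right, left]
center = np.random.rand(4, 66, 200, 3).astype('float32')
right = np.random.rand(4, 66, 200, 3).astype('float32')
left = np.random.rand(4, 66, 200, 3).astype('float32')
angles = np.random.rand(4, 1).astype('float32')

model.fit([center, right, left], angles, epochs=1, verbose=0)
preds = model.predict([center, right, left], verbose=0)
```

The same `[center, right, left]` list convention applies unchanged to the full `createModel()` above.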

Second method

def createModel():
  image_shape = (66, 200, 3)

  input_img = tf.keras.Input(image_shape)

  f = tf.keras.layers.Conv2D(24, (5, 5), (2, 2), activation='relu')(input_img)
  f = tf.keras.layers.Conv2D(36, (5, 5), (2, 2), activation='relu')(f)
  f = tf.keras.layers.Conv2D(48, (5, 5), (2, 2), activation='relu')(f)
  f = tf.keras.layers.Conv2D(64, (3, 3), activation='relu')(f)
  f = tf.keras.layers.Flatten()(f)

  model = tf.keras.Model([input_img], [f])
  model.summary()
  return model


def createCombineModel(center_model, right_model, left_model):
  image_shape = (66, 200, 3)

  input_img_center = tf.keras.Input(image_shape)
  input_img_right = tf.keras.Input(image_shape)
  input_img_left = tf.keras.Input(image_shape)

  f1 = center_model(input_img_center)
  f2 = right_model(input_img_right)
  f3 = left_model(input_img_left)

  # concatenate feature vector from 3 view
  f = tf.keras.layers.concatenate([f1, f2, f3])

  # create whatever layer you want (in this example, I followed by an additional fully connected layer)
  f = tf.keras.layers.Dense(100, activation = 'relu')(f)
  f = tf.keras.layers.Dense(50, activation = 'relu')(f)
  f = tf.keras.layers.Dense(10, activation = 'relu')(f)

  output = tf.keras.layers.Dense(1)(f)

  model = tf.keras.Model([input_img_center, input_img_right, input_img_left], [output])

  opt = tf.keras.optimizers.Adam(learning_rate=0.0001)
  model.compile(optimizer=opt, loss='mse')    

  model.summary()
  return model

center_model = createModel()
right_model = createModel()
left_model = createModel()

createCombineModel(center_model, right_model, left_model)
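The picture shows three towers with separate weights, but a common variant is to share one feature extractor across all views, which cuts the convolutional parameter count to a third. A hedged sketch of that alternative (not part of the original answer): calling the same `Model` object on each input reuses its kernels.

```python
import tensorflow as tf

def createSharedModel():
    image_shape = (66, 200, 3)

    # One base CNN whose weights are reused for every camera view
    base_in = tf.keras.Input(image_shape)
    x = tf.keras.layers.Conv2D(24, (5, 5), (2, 2), activation='relu')(base_in)
    x = tf.keras.layers.Conv2D(36, (5, 5), (2, 2), activation='relu')(x)
    x = tf.keras.layers.Conv2D(48, (5, 5), (2, 2), activation='relu')(x)
    x = tf.keras.layers.Conv2D(64, (3, 3), activation='relu')(x)
    x = tf.keras.layers.Flatten()(x)
    base = tf.keras.Model(base_in, x)

    input_img_center = tf.keras.Input(image_shape)
    input_img_right = tf.keras.Input(image_shape)
    input_img_left = tf.keras.Input(image_shape)

    # Calling the same model on each input shares the kernels
    f = tf.keras.layers.concatenate(
        [base(input_img_center), base(input_img_right), base(input_img_left)])
    f = tf.keras.layers.Dense(100, activation='relu')(f)
    f = tf.keras.layers.Dense(50, activation='relu')(f)
    f = tf.keras.layers.Dense(10, activation='relu')(f)
    output = tf.keras.layers.Dense(1)(f)

    model = tf.keras.Model(
        [input_img_center, input_img_right, input_img_left], output)
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.0001),
                  loss='mse')
    return model

model = createSharedModel()
```

Whether sharing is appropriate depends on how different the three camera viewpoints are; with separate weights each tower can specialize to its view.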

Upvotes: 1

Vishwas Chepuri

Reputation: 731

You can directly change input_shape to (N, 66, 200, 3). During training and testing, the input batch shape should then be (B, N, 66, 200, 3), where B is the batch size and N is the number of views, in your case three (center, left, right).
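A sketch of that change (this relies on tf.keras Conv2D accepting inputs of shape batch_shape + (rows, cols, channels), so the views axis is treated as an extra batch-like dimension; support may depend on your TF/Keras version):

```python
import tensorflow as tf

def createModel():
    model = tf.keras.Sequential([
        # (num_views, height, width, channels): the views axis is
        # carried through the convolutions as a batch-like dimension
        tf.keras.Input((3, 66, 200, 3)),
        tf.keras.layers.Conv2D(24, (5, 5), (2, 2), activation='relu'),
        tf.keras.layers.Conv2D(36, (5, 5), (2, 2), activation='relu'),
        tf.keras.layers.Conv2D(48, (5, 5), (2, 2), activation='relu'),
        tf.keras.layers.Conv2D(64, (3, 3), activation='relu'),
        tf.keras.layers.Conv2D(64, (3, 3), activation='relu'),
        tf.keras.layers.Flatten(),  # merges all 3 views: 3 * 1 * 18 * 64 = 3456
        tf.keras.layers.Dense(100, activation='relu'),
        tf.keras.layers.Dense(50, activation='relu'),
        tf.keras.layers.Dense(10, activation='relu'),
        tf.keras.layers.Dense(1),
    ])
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.0001),
                  loss='mse')
    return model

model = createModel()
```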

This is the model summary with input_shape = (3, 66, 200, 3).

Model: "sequential_3"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d_15 (Conv2D)           (None, 3, 31, 98, 24)     1824      
_________________________________________________________________
conv2d_16 (Conv2D)           (None, 3, 14, 47, 36)     21636     
_________________________________________________________________
conv2d_17 (Conv2D)           (None, 3, 5, 22, 48)      43248     
_________________________________________________________________
conv2d_18 (Conv2D)           (None, 3, 3, 20, 64)      27712     
_________________________________________________________________
conv2d_19 (Conv2D)           (None, 3, 1, 18, 64)      36928     
_________________________________________________________________
flatten_3 (Flatten)          (None, 3456)              0         
_________________________________________________________________
dense_12 (Dense)             (None, 100)               345700    
_________________________________________________________________
dense_13 (Dense)             (None, 50)                5050      
_________________________________________________________________
dense_14 (Dense)             (None, 10)                510       
_________________________________________________________________
dense_15 (Dense)             (None, 1)                 11        
=================================================================
Total params: 482,619
Trainable params: 482,619
Non-trainable params: 0
_________________________________________________________________

Upvotes: 2
