Jonny Joker

Reputation: 73

How to use a tf.data.Dataset correctly for multiple input layers with Keras

I have multiple input layers (20 input layers) and I want to use a tf.data.Dataset to feed the model. The batch size is 16. Unfortunately, model.fit(train_dataset, epochs=5) throws the following error:

ValueError: Error when checking model input: the list of numpy arrays that you are passing to your model is not the size the model expected. Expected to see 20 array(s), for inputs ['input_2', ... , 'input_21'] but instead got the following list of 1 arrays: [<tf.Tensor 'args_0:0' shape=(None, 20, 512, 512, 3) dtype=int32>]...

I assume that Keras wants a shape like (20, None, 512, 512, 3). Does anyone have an idea how to solve this problem, or how to use tf.data.Dataset correctly for a model with multiple input layers?

def read_tfrecord(bin_data):
    # Build the context-feature spec from the feature map defined elsewhere
    label_seq = {}
    for i in feature_map_dict:
        label_seq[i] = tf_input_feature_selector(feature_map_dict[i])
    img_seq = {'images': tf.io.FixedLenSequenceFeature([], dtype=tf.string)}
    cont, seq = tf.io.parse_single_sequence_example(serialized=bin_data, context_features=label_seq, sequence_features=img_seq)
    image_raw = seq['images']
    images = decode_image_raw(image_raw)
    # All 20 images of one object end up stacked in a single tensor
    images = tf.reshape(images, [20, 512, 512, 3])
    images = preprocess_input(images)
    label = cont["label"]
    return images, label
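(feature_map_dict, tf_input_feature_selector and decode_image_raw are helpers defined elsewhere in my code, and preprocess_input is the ResNet50 preprocessing function; minimal stand-ins so the snippet is self-contained could look roughly like this:)

# Minimal stand-ins for the helpers used above (the real ones are defined elsewhere)
feature_map_dict = {'label': 'int'}            # placeholder feature map

def tf_input_feature_selector(feature_type):
    # Map a feature-type tag to a tf.io parsing spec (simplified)
    if feature_type == 'int':
        return tf.io.FixedLenFeature([], tf.int64)
    return tf.io.FixedLenFeature([], tf.string)

def decode_image_raw(image_raw):
    # Decode the serialized image strings into a uint8 tensor (assumes JPEG bytes)
    return tf.map_fn(lambda img: tf.io.decode_jpeg(img, channels=3),
                     image_raw, fn_output_signature=tf.uint8)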

def get_dataset(tfrecord_path):
    dataset = tf.data.TFRecordDataset(filenames=tfrecord_path)
    dataset = dataset.map(read_tfrecord)
    dataset = dataset.batch(BATCH_SIZE)
    dataset = dataset.prefetch(buffer_size=AUTOTUNE)
    return dataset
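Inspecting the dataset shows why the mismatch happens: every element is a single stacked tensor rather than 20 separate ones (the path below is just a placeholder):

train_dataset = get_dataset('train.tfrecord')   # placeholder path
print(train_dataset.element_spec)
# (TensorSpec(shape=(None, 20, 512, 512, 3), dtype=tf.int32, name=None), TensorSpec(...))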

def create_model():
    nets = []
    inputs = []
    # Set up the shared base model
    base_ResNet50 = ResNet50(weights='imagenet', include_top=False, input_shape=(512, 512, 3))
    # One input layer per image, all feeding the same ResNet50 backbone
    for _ in range(20):
        x = Input(shape=(512, 512, 3))
        inputs.append(x)
        x = base_ResNet50(x)
        nets.append(x)
    # Element-wise maximum over the 20 feature maps
    maxpooling = tf.reduce_max(nets, axis=0)
    flatten = Flatten()(maxpooling)
    dense_1 = Dense(10, activation='sigmoid')(flatten)
    predictions = Dense(1, activation='sigmoid')(dense_1)
    model = Model(inputs=inputs, outputs=predictions)
    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model
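The model built this way expects a list of 20 separate arrays, which matches the input names in the error message:

model = create_model()
print(len(model.inputs))        # 20, i.e. input_2 ... input_21 from the error message
print(model.inputs[0].shape)    # (None, 512, 512, 3)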

Thanks in advance.

With a small modification of Niteya's idea, the test toy model runs the training. Great!

But I am still not happy with this solution, because all 20 images belong to one object, and as far as I understand this solution I would have to create 21 TFRecords. That way, the information about one object would be distributed across all of these files. I would prefer a simpler solution where all the information about an object is stored in a single TFRecord.
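One way that might keep everything in a single TFRecord is to split the stacked image tensor into the 20 inputs inside the dataset pipeline rather than in the model (a sketch based on the read_tfrecord above; not tested):

def split_into_inputs(images, label):
    # images has shape (20, 512, 512, 3) per example; turn it into a tuple of
    # 20 tensors of shape (512, 512, 3) so it matches the 20 input layers
    return tuple(tf.unstack(images, num=20, axis=0)), label

def get_dataset(tfrecord_path):
    dataset = tf.data.TFRecordDataset(filenames=tfrecord_path)
    dataset = dataset.map(read_tfrecord)
    dataset = dataset.map(split_into_inputs)
    dataset = dataset.batch(BATCH_SIZE)
    dataset = dataset.prefetch(buffer_size=AUTOTUNE)
    return dataset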

This test toy model works!

import tensorflow as tf
import tensorflow.keras as keras
from tensorflow.keras.layers import Input, Flatten, Dense
from tensorflow.keras.models import Model

x_1 = Input(shape=(100,100,3))
x_2 = Input(shape=(100,100,3))
inputs = [x_1,x_2]
flatten_1 = Flatten()(x_1)
flatten_2 = Flatten()(x_2)
dense_1 = Dense(50,activation='sigmoid')
d1_1 = dense_1(flatten_1)
d1_2 = dense_1(flatten_2)
nets =[d1_1,d1_2]
maxpooling = tf.reduce_max(nets, [0])
d2 = Dense(10,activation='sigmoid')(maxpooling)
predictions = Dense(1,activation='sigmoid')(d2)
model = Model(inputs=inputs, outputs=predictions)

model.compile(loss='binary_crossentropy', optimizer='adam',
        metrics=['accuracy'])

for layer in model.layers:
    print(layer.name)

# Two input datasets (random tensors standing in for real data), zipped into a tuple,
# then zipped with the labels so each element is ((x_1, x_2), y)
input_d = tf.data.Dataset.zip(tuple(tf.data.Dataset.from_tensors(tf.random.normal([16,100,100,3])) for i in range(2)))
output = tf.data.Dataset.from_tensors(tf.ones(16))
dataset = tf.data.Dataset.zip((input_d, output))

model.fit(dataset,epochs=5)

Using Niteya's second idea with the function tf.split is a good solution. Niteya, thank you very much.

inputs = Input(shape=(20,512,512,3))
for x in tf.split(inputs, num_or_size_splits=20, axis=1):
    x = tf.reshape(x, [-1,512,512,3])
    x = base_ResNet50(x)
    nets.append(x)
and 

BATCH_SIZE = 1
model.fit(train_dataset, steps_per_epoch=10, epochs=5)
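For completeness, this is roughly how the whole create_model looks with the tf.split change folded in (a sketch combining the snippets above):

def create_model():
    base_ResNet50 = ResNet50(weights='imagenet', include_top=False, input_shape=(512, 512, 3))
    # One input that holds all 20 images of an object, matching the dataset
    # elements of shape (None, 20, 512, 512, 3)
    inputs = Input(shape=(20, 512, 512, 3))
    nets = []
    for x in tf.split(inputs, num_or_size_splits=20, axis=1):
        x = tf.reshape(x, [-1, 512, 512, 3])   # drop the split axis
        x = base_ResNet50(x)
        nets.append(x)
    maxpooling = tf.reduce_max(nets, axis=0)
    flatten = Flatten()(maxpooling)
    dense_1 = Dense(10, activation='sigmoid')(flatten)
    predictions = Dense(1, activation='sigmoid')(dense_1)
    model = Model(inputs=inputs, outputs=predictions)
    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model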

Upvotes: 1

Views: 2560

Answers (1)

Niteya Shah

Reputation: 1824

Have you considered using tf.data.Dataset.zip? Your model needs to be fed 20 different inputs, so zip those 20 input datasets together into one tuple, and then zip that tuple dataset with the output dataset.

I am using random inputs here, but the method should be clear from it.

    # 20 input datasets (random tensors here), zipped into one tuple of inputs
    input_d = tf.data.Dataset.zip(tuple(tf.data.Dataset.from_tensors(tf.random.normal([16, 512,512,3])) for i in range(20)))
    # Labels, zipped with the input tuple so each element is ((x_1, ..., x_20), y)
    output = tf.data.Dataset.from_tensors(tf.ones(16))
    dataset = tf.data.Dataset.zip((input_d, output))
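If each of the 20 inputs lives in its own TFRecord file, the same pattern applies (purely illustrative; the file names and the parse_image/parse_label functions below are placeholders):

    # One TFRecord file per input branch plus one for the labels (all names are placeholders)
    input_datasets = tuple(
        tf.data.TFRecordDataset('input_%02d.tfrecord' % i).map(parse_image)
        for i in range(20)
    )
    inputs = tf.data.Dataset.zip(input_datasets)
    labels = tf.data.TFRecordDataset('labels.tfrecord').map(parse_label)
    dataset = tf.data.Dataset.zip((inputs, labels)).batch(16)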

https://www.tensorflow.org/api_docs/python/tf/data/Dataset#zip

Edit: Using tf.split, something like this could be done. Pass the entire dataset in as one input, and then split it inside the model (you may need to specify the axis).

    for i in tf.split(tf_record_Input, 20):
        x = base_ResNet50(i)
        nets.append(x)
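To be concrete about the axis: with batched elements of shape (batch, 20, 512, 512, 3), the split has to happen on axis 1 (a quick check with a placeholder tensor):

    example = tf.zeros([1, 20, 512, 512, 3])              # stands in for one batched element
    parts = tf.split(example, num_or_size_splits=20, axis=1)
    print(len(parts), parts[0].shape)                      # 20 parts, each (1, 1, 512, 512, 3)
    # each part keeps the split axis, hence the reshape to (-1, 512, 512, 3) before ResNet50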

Upvotes: 6
