Christo S. Christov

Reputation: 2309

Input multiple datasets to tensorflow model

Hi, I'm trying to feed multiple datasets into one model. The code below is a simplified example of my problem; in my real case one of the sub-models takes two inputs while the other takes one. The error I get in my real case is:

Failed to find data adapter that can handle input: (<class 'list'> containing values of types {"<class 'tensorflow.python.data.ops.dataset_ops.BatchDataset'>", "<class 'tensorflow.python.data.ops.dataset_ops.TakeDataset'>"}), <class 'NoneType'>

Code:

import tensorflow as tf

# Create first model
model1 = tf.keras.Sequential()
model1.add(tf.keras.layers.Dense(1))
model1.compile()
model1.build([None,3])

# Create second model
model2 = tf.keras.Sequential()
model2.add(tf.keras.layers.Dense(1))
model2.compile()
model2.build([None,3])


# Concatenate
fusion_model = tf.keras.layers.Concatenate()([model1.output, model2.output])
t = tf.keras.layers.Dense(1, activation='tanh')(fusion_model)
model = tf.keras.models.Model(inputs=[model1.input, model2.input], outputs=t)
model.compile()

#Datasets
ds1 = tf.data.Dataset.from_tensors(([1,2,3],1))
ds2 = tf.data.Dataset.from_tensors(([1,2,3], 2))

print(ds1)
print(ds2)
# Fit
model.fit([ds1,ds2])

Running this example code produces:

Failed to find data adapter that can handle input: (<class 'list'> containing values of types {"<class 'tensorflow.python.data.ops.dataset_ops.TensorDataset'>"}), <class 'NoneType'>

I need to use the tf.data Dataset API because it provides built-in lazy loading of the data.

Upvotes: 3

Views: 4110

Answers (3)

Phil Wernette

Reputation: 51

In case you're interested you can also solve the multiple input issue with tf.data.Dataset.zip() and dictionaries. I recently ran into a similar issue where I needed to input an image and a vector of values into a single model where they would Concatenate mid-model.

I used the tfdata_unzip() function from here to unzip my image tensor from the label tensor that was originally created using the image_dataset_from_directory() function. Then, I re-zipped the dataset together using tf.data.Dataset.zip().
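The tfdata_unzip() helper itself is not shown in this answer, so the following is only a minimal sketch of what such a function might look like, assuming the source dataset yields (features, labels) pairs. The name unzip_dataset and the toy data are hypothetical stand-ins, not the linked implementation.

```python
import tensorflow as tf

def unzip_dataset(ds):
    """Split a dataset of (features, labels) pairs into two datasets."""
    # Each map() keeps only one component of every element.
    features = ds.map(lambda x, y: x)
    labels = ds.map(lambda x, y: y)
    return features, labels

# Toy stand-in for the output of image_dataset_from_directory().
pairs = tf.data.Dataset.from_tensor_slices(
    (tf.zeros([4, 8, 8, 3]), tf.constant([0, 1, 0, 1]))
)
images, labels = unzip_dataset(pairs)
label_values = [int(v) for v in labels.as_numpy_iterator()]
```

One caveat with this pattern: each half iterates the source dataset independently, so if the source reshuffles without a fixed seed, the two halves may come back in different orders when zipped again. Building the source with shuffling disabled (or with a fixed seed) is probably the safer choice.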

When defining the model I used the Functional API and assigned each input layer a name:

import tensorflow as tf
from tensorflow.keras.layers import *

# create input image layer
in_image = Input(shape=(1305,2457,3), name='input_image')

# create input vector layer
in_vector = Input(shape=(1,), name='input_vector')

My full workflow is similar to the following:

# use tfdata_unzip() to separate input images from labels
input_images, input_labels = tfdata_unzip(input_dat)

# input vector was created using tf.data.Dataset.from_tensor_slices()
# using [1,2,3,4] as a placeholder for my original vector of values
# (named vector_ds here so it does not shadow the in_vector Input layer above)
vector_ds = tf.data.Dataset.from_tensor_slices([1,2,3,4])

# create a new input Dataset using .zip()
# data is structured as (1) a dictionary of inputs keyed by Input layer name and (2) their associated labels (input_labels)
model_inputs = tf.data.Dataset.zip(({"input_image":input_images, "input_vector":vector_ds}, input_labels))

# if you then wanted to batch, cache, and/or prefetch the dataset you could do so using the following
batchsize = 32
model_inputs = model_inputs.batch(batchsize).cache().prefetch(buffer_size=tf.data.AUTOTUNE)

Then, once the model has been built from the two input layers and its output tensor (predicted_class in my case), it can be fit by passing the Dataset as the first argument:

model.fit(model_inputs)

Because model_inputs is a Dataset with labels you do not need to define a y=input_labels in your model.fit() call.
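For readers who want something they can run, here is a small end-to-end sketch of the same idea. The image size, the single Dense output, the loss, and the random toy data are all assumptions chosen to keep the example small; only the dictionary keys matching the Input layer names is the essential part.

```python
import tensorflow as tf

# Named inputs; the names must match the dictionary keys in the dataset.
in_image = tf.keras.Input(shape=(8, 8, 3), name="input_image")
in_vector = tf.keras.Input(shape=(1,), name="input_vector")

# Flatten the image branch and concatenate it with the vector mid-model.
x = tf.keras.layers.Flatten()(in_image)
x = tf.keras.layers.Concatenate()([x, in_vector])
out = tf.keras.layers.Dense(1, activation="sigmoid")(x)

model = tf.keras.Model(inputs=[in_image, in_vector], outputs=out)
model.compile(optimizer="adam", loss="binary_crossentropy")

# Toy data standing in for the real images, vector values, and labels.
images = tf.data.Dataset.from_tensor_slices(tf.random.uniform([4, 8, 8, 3]))
vectors = tf.data.Dataset.from_tensor_slices(
    tf.constant([[1.0], [2.0], [3.0], [4.0]])
)
labels = tf.data.Dataset.from_tensor_slices(
    tf.constant([[0.0], [1.0], [0.0], [1.0]])
)

# Element structure: ({"input_image": ..., "input_vector": ...}, label)
model_inputs = tf.data.Dataset.zip(
    ({"input_image": images, "input_vector": vectors}, labels)
).batch(2)

history = model.fit(model_inputs, epochs=1, verbose=0)
```

Note that the vector elements here have shape (1,) so that, after batching, they line up with the Input(shape=(1,)) layer.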

I should also mention that I did the same data re-structuring for validation data and passed it to the model.fit() function by adding validation_data=model_validation_data, where "model_validation_data" is similar to the model_inputs structure.

This is just how I was able to address this issue of multiple inputs into a TF multimodal model. Happy to discuss any issues that arise or other solutions.

Upvotes: 2

Stavros Koureas

Reputation: 1472

I had the same problem while trying to fit using two Datasets that were built with the text_dataset_from_directory function. For me, concatenating the Datasets is not a solution, as each Dataset may pass through different Keras layers. So what I did was build a custom "fit_generator" function. It converts the Dataset objects into arrays, which Keras supports for multi-input models.

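import pandas as pd

def fit_generator(dataset, batch_size):
  # Note: batch_size is accepted to match the calls below but is not used here.
  X = []
  y = []
  # Assuming dataset is already batched (the text_dataset_from_directory default),
  # batch(1) adds a leading axis, so index [0] recovers each original batch.
  for string_, int_ in dataset.batch(1):
    for i in range(0, len(int_[0])):
      X.append(string_[0][i].numpy())
      y.append(int_[0][i].numpy())
  # Convert the collected samples to NumPy arrays.
  X_ret = pd.DataFrame(X).to_numpy()
  y_ret = pd.DataFrame(y).to_numpy()
  return X_ret, y_ret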

Then you can de-construct the datasets

train_X1, train_y1 = fit_generator(train_ds_1, batch_size)
train_X2, train_y2 = fit_generator(train_ds_2, batch_size)
val_X1, val_y1 = fit_generator(val_ds_1, batch_size)
val_X2, val_y2 = fit_generator(val_ds_2, batch_size)

Then you can make dictionaries with named inputs

train_X = {'Input1': train_X1, 'Input2': train_X2}
train_y = {'Input1': train_y1, 'Input2': train_y2}
val_X = {'Input1': val_X1, 'Input2': val_X2}
val_y = {'Input1': val_y1, 'Input2': val_y2}

Then you can call the fit method like this

model.fit(x=train_X, y=train_y1, validation_data=(val_X, val_y1), batch_size=batch_size, epochs=10)
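A more compact way to do the same Dataset-to-array conversion might look like the sketch below. It assumes a batched dataset of (features, labels) pairs; the name dataset_to_arrays and the toy data are hypothetical. Keep in mind that any approach like this loads the whole dataset into memory, so it gives up the lazy loading the question asks for.

```python
import numpy as np
import tensorflow as tf

def dataset_to_arrays(dataset):
    """Materialize a batched (features, labels) dataset as two NumPy arrays."""
    xs, ys = [], []
    # unbatch() yields one sample at a time; as_numpy_iterator() converts
    # each component to a NumPy value.
    for x, y in dataset.unbatch().as_numpy_iterator():
        xs.append(x)
        ys.append(y)
    return np.array(xs), np.array(ys)

# Toy batched dataset standing in for train_ds_1.
toy = tf.data.Dataset.from_tensor_slices(
    ([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]], [0, 1, 0])
).batch(2)
X, y = dataset_to_arrays(toy)
```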

Upvotes: 0

hydra1

Reputation: 66

As noted in the comment, the .fit method of Keras models in TensorFlow does not support a list of Datasets.

If you really want to use Datasets, you could use a dictionary as the input, and have named input layers to match to the dict.

Here's how you do it:

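import tensorflow as tf

# Name the input layers; the dictionary keys in the dataset must match these names.
in1 = tf.keras.Input(shape=(3,), name="layer_1")
in2 = tf.keras.Input(shape=(3,), name="layer_2")
merged = tf.keras.layers.Concatenate()([tf.keras.layers.Dense(1)(in1),
                                        tf.keras.layers.Dense(1)(in2)])
t = tf.keras.layers.Dense(1, activation='tanh')(merged)
model = tf.keras.models.Model(inputs=[in1, in2], outputs=t)
model.compile(loss='mse')
model.summary()

ds1 = tf.data.Dataset.from_tensors(({"layer_1": [[1,2,3]],
                                     "layer_2": [[1,2,3]]}, [[2]]))

model.fit(ds1)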
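The question starts from two separate labeled datasets rather than a single dictionary-shaped one. A minimal sketch of how they could be combined is below, assuming both datasets yield elements in the same order and that the label from the first dataset is the one to train on (both are assumptions, as are the names input_a, input_b, and merge).

```python
import tensorflow as tf

# Two named inputs; these names become the dictionary keys.
in_a = tf.keras.Input(shape=(3,), name="input_a")
in_b = tf.keras.Input(shape=(3,), name="input_b")
merged = tf.keras.layers.Concatenate()(
    [tf.keras.layers.Dense(1)(in_a), tf.keras.layers.Dense(1)(in_b)]
)
out = tf.keras.layers.Dense(1, activation="tanh")(merged)
model = tf.keras.Model(inputs=[in_a, in_b], outputs=out)
model.compile(optimizer="sgd", loss="mse")

# Two separate (features, label) datasets, as in the question.
ds1 = tf.data.Dataset.from_tensors(([1.0, 2.0, 3.0], [1.0]))
ds2 = tf.data.Dataset.from_tensors(([1.0, 2.0, 3.0], [2.0]))

def merge(a, b):
    # a and b are the (features, label) pairs from ds1 and ds2.
    (x_a, y_a), (x_b, _) = a, b
    # Assumption: keep the label from the first dataset.
    return {"input_a": x_a, "input_b": x_b}, y_a

# zip pairs the elements lazily; map reshapes them into (inputs_dict, label).
combined = tf.data.Dataset.zip((ds1, ds2)).map(merge).batch(1)
history = model.fit(combined, epochs=1, verbose=0)
```

Because zip and map are both lazy, this keeps the on-demand loading behaviour of the original datasets.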

An easier option is to simply use tensors instead of datasets. .fit supports a list of tensors as input so just use that.

model = tf.keras.models.Model(inputs=[model1.input, model2.input], outputs=t)
model.compile(loss='mse')

model.summary()

a = tf.constant([[1, 2, 3]])
b = tf.constant([[1, 2, 3]])

c = tf.constant([[1]])

model.fit([a, b], c)

Upvotes: 4
