WL_Law

Reputation: 87

Tensorflow Model I/O question: Failed to find data adapter that can handle input

This is based on the Keras documentation example: train a model that predicts a priority_score for an email and which department to forward it to.

I implemented the model in a different way. It compiles, but I cannot train it. I suspect it is a model I/O issue, i.e. I need to feed the data in the correct format.

ValueError: Failed to find data adapter that can handle input: (<class 'dict'> containing {"<class 'str'>"} keys and {"<class 'numpy.ndarray'>", '(<class \'list\'> containing values of types {"<class \'str\'>"})'} values), (<class 'dict'> containing {"<class 'str'>"} keys and {"<class 'numpy.ndarray'>"} values)

It's too long, so I didn't put it in this post's title.

My model has 3 inputs:

  • title_input: one string per sample
  • body_input: one string per sample
  • tags_input: a vector of 12 tag values per sample

The outputs are:

  • priority: a single priority score per sample
  • departments: a vector of 4 department scores per sample

Questions

Can anyone tell me what's wrong with my code?

And generally speaking, how should I think about the I/O of a model in a case like this? I thought preparing N strings (say 800 strings) and 800 tag vectors would be enough, but I keep getting errors. I solved most of them but couldn't get past this one. Please share your experience. Thanks!

Appendix

Model summary

Model: "functional_1"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
tags_input (InputLayer)         [(None, 12)]         0                                            
__________________________________________________________________________________________________
flatten (Flatten)               (None, 12)           0           tags_input[0][0]                 
__________________________________________________________________________________________________
title_input (InputLayer)        [(None, 1)]          0                                            
__________________________________________________________________________________________________
body_input (InputLayer)         [(None, 1)]          0                                            
__________________________________________________________________________________________________
dense (Dense)                   (None, 500)          6500        flatten[0][0]                    
__________________________________________________________________________________________________
text_vectorization (TextVectori (None, 500)          0           title_input[0][0]                
__________________________________________________________________________________________________
text_vectorization_1 (TextVecto (None, 500)          0           body_input[0][0]                 
__________________________________________________________________________________________________
tf_op_layer_ExpandDims (TensorF [(None, 500, 1)]     0           dense[0][0]                      
__________________________________________________________________________________________________
embedding (Embedding)           (None, 500, 100)     1000100     text_vectorization[0][0]         
__________________________________________________________________________________________________
embedding_1 (Embedding)         (None, 500, 100)     1000100     text_vectorization_1[0][0]       
__________________________________________________________________________________________________
dense_1 (Dense)                 (None, 500, 100)     200         tf_op_layer_ExpandDims[0][0]     
__________________________________________________________________________________________________
concatenate (Concatenate)       (None, 500, 300)     0           embedding[0][0]                  
                                                                 embedding_1[0][0]                
                                                                 dense_1[0][0]                    
__________________________________________________________________________________________________
priority (Dense)                (None, 500, 1)       301         concatenate[0][0]                
__________________________________________________________________________________________________
departments (Dense)             (None, 500, 4)       1204        concatenate[0][0]                
==================================================================================================
Total params: 2,008,405
Trainable params: 2,008,405
Non-trainable params: 0
__________________________________________________________________________________________________

Full Code

import string

import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.layers import TextVectorization  # in TF < 2.6: tensorflow.keras.layers.experimental.preprocessing


def MultiInputAndOutpt():
  max_features = 10000
  sequence_length = 500
  embedding_dims = 100
  num_departments = 4
  num_tags = 12

  title_vect = TextVectorization(max_tokens=max_features, output_mode="int", output_sequence_length=sequence_length)
  body_vect = TextVectorization(max_tokens=max_features, output_mode="int", output_sequence_length=sequence_length)

  title_input = keras.Input(shape=(1,), dtype=tf.string, name="title_input")
  x1 = title_vect(title_input)
  x1 = layers.Embedding(input_dim=max_features + 1, output_dim=embedding_dims)(x1)

  body_input = keras.Input(shape=(1,), dtype=tf.string, name="body_input")
  x2 = body_vect(body_input)
  x2 = layers.Embedding(input_dim=max_features + 1, output_dim=embedding_dims)(x2)

  tags_input = keras.Input(shape=(num_tags,), name="tags_input")
  x3 = layers.Flatten()(tags_input)
  x3 = layers.Dense(500)(x3)
  x3 = tf.expand_dims(x3, axis=-1)
  x3 = layers.Dense(100)(x3)

  x = layers.concatenate([x1, x2, x3])

  priority_score = layers.Dense(1)(x)
  priority_score = tf.reshape(priority_score, (-1, 1), name="priority")
  departments = layers.Dense(num_departments)(x)
  departments = tf.reshape(departments, (-1, num_departments), name="departments")

  model = keras.Model(inputs=[title_input, body_input, tags_input], outputs=[priority_score, departments])
  model.summary()

  model.compile(optimizer=keras.optimizers.Adam(learning_rate=0.1),
                loss=[keras.losses.BinaryCrossentropy(from_logits=True),
                      keras.losses.CategoricalCrossentropy(from_logits=True)],
                loss_weights=[1.0, 0.2],
                )

  # title_data = np.random.randint(num_words, size=(1280, 10))
  # body_data = np.random.randint(num_words, size=(1280, 100))
  alphabet = np.array(list(string.ascii_lowercase + ' '))
  title_data = np.random.choice(alphabet, size=(800, 1000))
  body_data = np.random.choice(alphabet, size=(800, 1000))
  tags_data = np.random.randint(2, size=(800, num_tags)).astype("float32")

  body_data = ["".join(body_data[i]) for i in range(len(body_data))]
  title_data = ["".join(title_data[i]) for i in range(len(title_data))]

  # Dummy target data
  priority_targets = np.random.random(size=(800, 1))
  dept_targets = np.random.randint(2, size=(800, num_departments))

  model.fit(
    {"title_input": title_data, "body_input": body_data, "input3": tags_data},
    {"priority": priority_targets, "departments": dept_targets},
    epochs=2,
    batch_size=32, )

Upvotes: 1

Views: 1553

Answers (1)

WL_Law

Reputation: 87

I figured it out myself:

Input

The inputs of the model are correct, though I don't need the Flatten layer.

tags_input (InputLayer)         [(None, 12)]          0                                            
__________________________________________________________________________________________________
title_input (InputLayer)        [(None, 1)]          0                                            
__________________________________________________________________________________________________
body_input (InputLayer)         [(None, 1)]          0                                            
__________________________________________________________________________________________________

Output

The outputs are not correct: I don't want (None, 500, 1) and (None, 500, 4) as the output shapes. I only need one priority score and one department vector of 4 values.

To change the shape from (None, 500, 1) to (None, 1) I need to drop some values. There are many ways to do this (an alternative sketch appears after the updated summary below); here I chose to drop the middle dimension directly.

...
  departments = layers.Dense(num_departments)(x)  # Shape: (None, 500, 4)
  departments = tf.slice(departments, [0, 0, 0], [-1, 1, 4])  # Shape: (None, 1, 4)
  departments = tf.squeeze(departments, [1])  # Shape: (None, 4)
  departments = layers.Dense(num_departments, name="departments")(departments)  # Shape: (None, 4)
...

The same applies to the priority_score output.

And now the outputs become:

priority_score (Dense)          (None, 1)            2           tf.compat.v1.squeeze[0][0]       
__________________________________________________________________________________________________
departments (Dense)             (None, 4)            20          tf.compat.v1.squeeze_1[0][0]     
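
As an aside, another way to collapse the (None, 500, 300) tensor down to one vector per sample is to pool over the sequence axis instead of slicing. This is a sketch of an alternative I did not use above:

...
  x = layers.concatenate([x1, x2, x3])                                      # Shape: (None, 500, 300)
  pooled = layers.GlobalAveragePooling1D()(x)                               # Shape: (None, 300)
  priority_score = layers.Dense(1, name="priority_score")(pooled)           # Shape: (None, 1)
  departments = layers.Dense(num_departments, name="departments")(pooled)   # Shape: (None, 4)
...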

Train model

The next step is to prepare the training data. What we need to construct is:

  • title data: N strings, shape (N, 1); here the 1 means one Python string per sample.
  • body data: same as the title data.
  • tags data: N float vectors, shape (N, 12).

Targets:

  • priority_score: N floats, shape (N, 1)
  • department: N float vectors, shape (N, 4)

where N can be any number (a minimal sketch follows below).
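
Here is a minimal sketch of one way to build dummy data with exactly these shapes; the random-string generation mirrors the code in the question, and the variable names are only for illustration:

import string
import numpy as np

N = 800
num_tags, num_departments = 12, 4
alphabet = list(string.ascii_lowercase + " ")

# Inputs: one Python string per sample for title/body, a 12-value float vector for tags
title_data = np.array(["".join(np.random.choice(alphabet, size=1000)) for _ in range(N)]).reshape(-1, 1)  # (N, 1)
body_data = np.array(["".join(np.random.choice(alphabet, size=1000)) for _ in range(N)]).reshape(-1, 1)   # (N, 1)
tags_data = np.random.randint(2, size=(N, num_tags)).astype("float32")                                    # (N, 12)

# Targets
priority_targets = np.random.random(size=(N, 1)).astype("float32")                                        # (N, 1)
dept_targets = np.random.randint(2, size=(N, num_departments)).astype("float32")                          # (N, 4)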

Then we call the fit function:

  model.fit(
    {"title_input": title_data, "body_input": body_data, "tags_input": tags_data},
    {"priority_score": priority_targets, "departments": dept_targets, },
    epochs=50,
    batch_size=64, )
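
For the x and y dicts to be matched up, the keys must be the names of the Input layers and of the output layers, respectively. The losses and loss weights can also be given as dicts keyed by the same output names; this is a sketch equivalent to the list form used in the question, not code from the original post:

  # Dict keys match the output layer names; metrics=["acc"] yields the
  # per-output priority_score_acc / departments_acc seen in the log below
  model.compile(
      optimizer=keras.optimizers.Adam(learning_rate=0.1),
      loss={
          "priority_score": keras.losses.BinaryCrossentropy(from_logits=True),
          "departments": keras.losses.CategoricalCrossentropy(from_logits=True),
      },
      loss_weights={"priority_score": 1.0, "departments": 0.2},
      metrics=["acc"],
  )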

Surprisingly, the loss keeps growing:

Epoch 1/50
157/157 [==============================] - 5s 28ms/step - loss: 1.3467 - priority_score_loss: 0.6938 - departments_loss: 3.2644 - priority_score_acc: 0.0000e+00 - departments_acc: 0.1267
Epoch 2/50
157/157 [==============================] - 4s 27ms/step - loss: 4.6381 - priority_score_loss: 0.6976 - departments_loss: 19.7023 - priority_score_acc: 0.0000e+00 - departments_acc: 0.2483
Epoch 3/50
157/157 [==============================] - 4s 28ms/step - loss: 16.9411 - priority_score_loss: 0.6984 - departments_loss: 81.2137 - priority_score_acc: 0.0000e+00 - departments_acc: 0.1569
Epoch 4/50
157/157 [==============================] - 5s 29ms/step - loss: 23.8020 - priority_score_loss: 0.7075 - departments_loss: 115.4721 - priority_score_acc: 0.0000e+00 - departments_acc: 0.1427
Epoch 5/50
157/157 [==============================] - 5s 29ms/step - loss: 1.8650 - priority_score_loss: 0.7046 - departments_loss: 5.8019 - priority_score_acc: 0.0000e+00 - departments_acc: 0.1995
Epoch 6/50
157/157 [==============================] - 5s 30ms/step - loss: 3.0613 - priority_score_loss: 0.7025 - departments_loss: 11.7943 - priority_score_acc: 0.0000e+00 - departments_acc: 0.2472
Epoch 7/50
157/157 [==============================] - 5s 30ms/step - loss: 5.2455 - priority_score_loss: 0.7032 - departments_loss: 22.7114 - priority_score_acc: 0.0000e+00 - departments_acc: 0.2402
Epoch 8/50
157/157 [==============================] - 5s 30ms/step - loss: 6.0378 - priority_score_loss: 0.7013 - departments_loss: 26.6828 - priority_score_acc: 0.0000e+00 - departments_acc: 0.2418
Epoch 9/50
157/157 [==============================] - 5s 30ms/step - loss: 10.8300 - priority_score_loss: 0.7033 - departments_loss: 50.6334 - priority_score_acc: 0.0000e+00 - departments_acc: 0.2465
Epoch 10/50
157/157 [==============================] - 4s 27ms/step - loss: 12.1005 - priority_score_loss: 0.7019 - departments_loss: 56.9929 - priority_score_acc: 0.0000e+00 - departments_acc: 0.2627
Epoch 11/50
157/157 [==============================] - 4s 27ms/step - loss: 15.8248 - priority_score_loss: 0.6983 - departments_loss: 75.6328 - priority_score_acc: 0.0000e+00 - departments_acc: 0.2513
Epoch 12/50
157/157 [==============================] - 5s 29ms/step - loss: 19.3059 - priority_score_loss: 0.6940 - departments_loss: 93.0596 - priority_score_acc: 0.0000e+00 - departments_acc: 0.2386
Epoch 13/50
157/157 [==============================] - 5s 29ms/step - loss: 32.6499 - priority_score_loss: 0.6937 - departments_loss: 159.7808 - priority_score_acc: 0.0000e+00 - departments_acc: 0.2526
Epoch 14/50
157/157 [==============================] - 4s 28ms/step - loss: 31.1433 - priority_score_loss: 0.6936 - departments_loss: 152.2486 - priority_score_acc: 0.0000e+00 - departments_acc: 0.2499
Epoch 15/50
157/157 [==============================] - 5s 29ms/step - loss: 41.9199 - priority_score_loss: 0.6932 - departments_loss: 206.1338 - priority_score_acc: 0.0000e+00 - departments_acc: 0.2362
Epoch 16/50
157/157 [==============================] - 5s 30ms/step - loss: 40.2069 - priority_score_loss: 0.6931 - departments_loss: 197.5692 - priority_score_acc: 0.0000e+00 - departments_acc: 0.2300
Epoch 17/50
157/157 [==============================] - 5s 30ms/step - loss: 60.4129 - priority_score_loss: 0.6932 - departments_loss: 298.5986 - priority_score_acc: 0.0000e+00 - departments_acc: 0.2425
Epoch 18/50
157/157 [==============================] - 5s 30ms/step - loss: 75.8330 - priority_score_loss: 0.6932 - departments_loss: 375.6990 - priority_score_acc: 0.0000e+00 - departments_acc: 0.2332
Epoch 19/50
157/157 [==============================] - 5s 29ms/step - loss: 81.5731 - priority_score_loss: 0.6931 - departments_loss: 404.4002 - priority_score_acc: 0.0000e+00 - departments_acc: 0.2568
Epoch 20/50
157/157 [==============================] - 4s 28ms/step - loss: 103.4053 - priority_score_loss: 0.6932 - departments_loss: 513.5608 - priority_score_acc: 0.0000e+00 - departments_acc: 0.2409
Epoch 21/50
157/157 [==============================] - 4s 28ms/step - loss: 106.4842 - priority_score_loss: 0.6932 - departments_loss: 528.9552 - priority_score_acc: 0.0000e+00 - departments_acc: 0.2584
Epoch 22/50
157/157 [==============================] - 4s 28ms/step - loss: 121.2103 - priority_score_loss: 0.6932 - departments_loss: 602.5854 - priority_score_acc: 0.0000e+00 - departments_acc: 0.2332
Epoch 23/50
157/157 [==============================] - 5s 29ms/step - loss: 139.4970 - priority_score_loss: 0.6932 - departments_loss: 694.0189 - priority_score_acc: 0.0000e+00 - departments_acc: 0.2421
Epoch 24/50
157/157 [==============================] - 5s 29ms/step - loss: 180.7346 - priority_score_loss: 0.6933 - departments_loss: 900.2067 - priority_score_acc: 0.0000e+00 - departments_acc: 0.2449
Epoch 25/50
157/157 [==============================] - 4s 28ms/step - loss: 201.8011 - priority_score_loss: 0.6932 - departments_loss: 1005.5396 - priority_score_acc: 0.0000e+00 - departments_acc: 0.2420
Epoch 26/50

I guess this is because the training data is randomly generated and the model is not well constructed. Anyway, we can train the model and predict with some data now.
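
For example, prediction uses the same dict-of-named-inputs format. A minimal sketch; the sample values below are made up:

  preds = model.predict({
    "title_input": np.array([["urgent server outage"], ["lunch menu update"]]),
    "body_input": np.array([["the api gateway is down"], ["new salads were added this week"]]),
    "tags_input": np.random.randint(2, size=(2, 12)).astype("float32"),
  })
  priority_pred, departments_pred = preds  # shapes: (2, 1) and (2, 4)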

This was a good learning experience.

Upvotes: 1
