Max Teiger

Reputation: 145

Keras: ValueError: setting an array element with a sequence

Context:

I am just starting out in Deep Learning and I have to implement a model in Python that can detect the inference relation between two sentences (the label is neutral, contradiction, or entailment). The dataset is formatted as follows:

| index | sentence_1 | sentence_2 | label   |
|-------|------------|------------|---------|
| 1     | blabla     | blabla     | neutral |

To accomplish this task, I chose to use Keras, which seemed relatively easy to use. I encoded the sentences with GloVe embedding vectors (dim=50) and then padded them to maxlen=80 (a rough sketch of this step is shown below).

I end up with a new pandas DataFrame:

| index | sentence_1_padded | sentence_2_padded | label         | target |
|-------|-------------------|-------------------|---------------|--------|
| 1     | matrix 80*50      | matrix 80*50      | neutral       | 2      |
| ...   | ...               | ...               | ...           | ...    |
| 5000  | matrix 80*50      | matrix 80*50      | contradiction | 0      |

Each embedding vector and each padded sequence is a numpy array.
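
For reference, the encoding and padding step looks roughly like this (a minimal sketch; encode_and_pad is just an illustrative name for my helper):

import numpy as np

def encode_and_pad(sentence, glove_wordmap, maxlen=80, dim=50):
    # Look up the GloVe vector for each word found in the word map
    vectors = [glove_wordmap[w] for w in sentence.lower().split() if w in glove_wordmap]
    # Pad (or truncate) to a fixed (maxlen, dim) matrix with zero rows
    padded = np.zeros((maxlen, dim))
    for i, v in enumerate(vectors[:maxlen]):
        padded[i] = v
    return padded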

I want to train my model using this transformed dataset.

So I built this model:

from keras.layers import Input, Embedding, LSTM, Dense
from keras.models import Model
from keras.utils import to_categorical

vocab_size = len(glove_wordmap) + 1
X = dataset_processed[['sentence_1_padded', 'sentence_2_padded']]
y = dataset_processed[['target']]

inputs = Input(shape=(2,))
embedding_layer = Embedding(vocab_size, 128)(inputs)
x = LSTM(64)(embedding_layer)
x = Dense(32, activation='relu')(x)

predictions = Dense(3, activation='softmax')(x)

model = Model(inputs=[inputs], outputs=predictions)
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['acc'])

model.fit(X, to_categorical(y), epochs=5, batch_size=32, validation_split=0.25)

and I keep getting the following error:

Train on 3750 samples, validate on 1250 samples
Epoch 1/5
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-248-ddab59c5a43a> in <module>()
----> 1 model.fit(X, to_categorical(y), epochs=5, batch_size=32, validation_split=0.25)

4 frames
/usr/local/lib/python3.6/dist-packages/keras/engine/training.py in fit(self, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, steps_per_epoch, validation_steps, validation_freq, max_queue_size, workers, use_multiprocessing, **kwargs)
   1176                                         steps_per_epoch=steps_per_epoch,
   1177                                         validation_steps=validation_steps,
-> 1178                                         validation_freq=validation_freq)
   1179 
   1180     def evaluate(self,

/usr/local/lib/python3.6/dist-packages/keras/engine/training_arrays.py in fit_loop(model, fit_function, fit_inputs, out_labels, batch_size, epochs, verbose, callbacks, val_function, val_inputs, shuffle, callback_metrics, initial_epoch, steps_per_epoch, validation_steps, validation_freq)
    202                     ins_batch[i] = ins_batch[i].toarray()
    203 
--> 204                 outs = fit_function(ins_batch)
    205                 outs = to_list(outs)
    206                 for l, o in zip(out_labels, outs):

/usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py in __call__(self, inputs)
   2977                     return self._legacy_call(inputs)
   2978 
-> 2979             return self._call(inputs)
   2980         else:
   2981             if py_any(is_tensor(x) for x in inputs):

/usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py in _call(self, inputs)
   2915                 array_vals.append(
   2916                     np.asarray(value,
-> 2917                                dtype=tf.as_dtype(tensor.dtype).as_numpy_dtype))
   2918         if self.feed_dict:
   2919             for key in sorted(self.feed_dict.keys()):

/usr/local/lib/python3.6/dist-packages/numpy/core/_asarray.py in asarray(a, dtype, order)
     83 
     84     """
---> 85     return array(a, dtype, copy=False, order=order)
     86 
     87 

ValueError: setting an array element with a sequence.

I have read this post, but it doesn't help since I've checked the dimensions and the type of elements in X, and each sequence in each column is a numpy array of shape (80,50).

I would be really grateful to anyone who has a solution to this problem, or who knows of a beginner-friendly tutorial for this kind of task.

Thank you for your help!

PS: Feel free to tell me if I'm off to a bad start with this problem.

Upvotes: 0

Views: 627

Answers (1)

thushv89

Reputation: 11333

You're trying to solve a sentence entailment problem. This means you need two streams of network flow in your graph (i.e. one for each sentence). The main problem is that you have defined an Input layer of shape (None, 2), but your input has a sequence length of 80 (it probably has shape (None, 80, 50, 2)).

Another problem is that sentence_1_padded and sentence_2_padded need to be (5000, 80), not (5000, 80, 50), because your Embedding layer expects word IDs, not the GloVe embeddings themselves. If you want the GloVe embeddings, you need to initialize your Embedding layer with the GloVe vectors.
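
For example, a minimal sketch of that initialization (here word_index is a hypothetical dict mapping each word to its integer ID, and glove_wordmap is your word-to-vector dict):

import numpy as np
from keras.layers import Embedding

embedding_dim = 50
embedding_matrix = np.zeros((vocab_size, embedding_dim))
for word, idx in word_index.items():
    vector = glove_wordmap.get(word)
    if vector is not None:
        embedding_matrix[idx] = vector  # rows for unknown words stay all-zero

embedding_layer = Embedding(vocab_size, embedding_dim,
                            weights=[embedding_matrix],
                            trainable=False)  # set trainable=True to fine-tune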

Therefore, you need to make the following changes.

  • Define two input layers of shape (80,).
  • Define two streams of outputs, one per input, and later concatenate them to get a single prediction.
  • Instead of having sentence_1_padded and sentence_2_padded as 80x50 matrices (i.e. after performing the embedding lookup), represent them as word-ID sequences of length 80 (see the sketch after this list).
  • After these changes, data['sentence_1_padded'] for example should return a (5000, 80) matrix.
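
For example, one way to produce such word-ID matrices is with Keras' Tokenizer and pad_sequences (a sketch, assuming the raw sentence columns of your original dataframe, here called dataset, are still available):

from keras.preprocessing.text import Tokenizer
from keras.preprocessing.sequence import pad_sequences

tokenizer = Tokenizer()
tokenizer.fit_on_texts(list(dataset['sentence_1']) + list(dataset['sentence_2']))

X1 = pad_sequences(tokenizer.texts_to_sequences(dataset['sentence_1']), maxlen=80)  # (5000, 80)
X2 = pad_sequences(tokenizer.texts_to_sequences(dataset['sentence_2']), maxlen=80)
vocab_size = len(tokenizer.word_index) + 1  # +1 because ID 0 is reserved for padding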
import numpy as np
from keras.layers import Input, Embedding, LSTM, Dense, Concatenate
from keras.models import Model
from keras.utils import to_categorical

# Toy data (dataset size 250)
# X1 = np.random.randint(0, 100, size=(250, 80))
# X2 = np.random.randint(0, 100, size=(250, 80))
# y = np.random.choice([0, 1, 2], size=(250,))

# np.stack turns a column of per-row (80,) arrays into a (num_samples, 80) matrix
X = [np.stack(dataset_processed['sentence_1_padded']),
     np.stack(dataset_processed['sentence_2_padded'])]

# One input per sentence, each a sequence of 80 word IDs
inputs1 = Input(shape=(80,))
inputs2 = Input(shape=(80,))

# Shared embedding layer: both sentences use the same word embeddings
embedding_layer = Embedding(vocab_size, 128)
emb_out1 = embedding_layer(inputs1)
emb_out2 = embedding_layer(inputs2)

# Shared LSTM encodes each sentence into a 64-dimensional vector
lstm_layer = LSTM(64)
x1 = lstm_layer(emb_out1)
x2 = lstm_layer(emb_out2)

dense = Dense(32, activation='relu')
x1 = dense(x1)
x2 = dense(x2)

# Concatenate the two sentence encodings and classify into the 3 labels
x = Concatenate(axis=-1)([x1, x2])
predictions = Dense(3, activation='softmax')(x)

model = Model(inputs=[inputs1, inputs2], outputs=predictions)
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['acc'])

model.fit(X, to_categorical(y), epochs=5, batch_size=32, validation_split=0.25)
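
Note that the Embedding, LSTM and Dense layers above are shared between the two inputs (a siamese-style setup), so both sentences are encoded with the same weights before the concatenation; you could also use two independent streams, but sharing weights is a common choice for entailment models.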

Upvotes: 1
