Reputation: 145
I am just getting started with Deep Learning and I have to implement a model in Python that detects the inference relation between two sentences (the label is neutral, contradiction, or entailment). The data set is formatted as follows:
| index | sentence_1 | sentence_2 | label   |
|-------|------------|------------|---------|
| 1     | blabla     | blabla     | neutral |
To accomplish this task, I chose to use Keras, which seemed relatively easy to use.
I encoded the sentences with GloVe embedding vectors (dim=50) and then padded them with maxlen=80.
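The preprocessing looks roughly like this (a sketch of what I do; glove_wordmap is the word-to-vector dict I built from the GloVe file, and unknown words get a zero vector):

import numpy as np

def encode_and_pad(sentence, glove_wordmap, dim=50, maxlen=80):
    # Look up the GloVe vector of each token; unknown words become zero vectors
    vectors = [glove_wordmap.get(w, np.zeros(dim)) for w in sentence.lower().split()]
    vectors = vectors[:maxlen]
    # Pad with zero vectors up to maxlen -> one (80, 50) matrix per sentence
    vectors += [np.zeros(dim)] * (maxlen - len(vectors))
    return np.array(vectors)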
I end up with a new pandas DataFrame:
| index | sentence_1_padded | sentence_2_padded | label         | target |
|-------|-------------------|-------------------|---------------|--------|
| 1     | matrix 80*50      | matrix 80*50      | neutral       | 2      |
| ...   | ...               | ...               | ...           | ...    |
| 5000  | matrix 80*50      | matrix 80*50      | contradiction | 0      |
Each embedding vector and each sequence is a numpy array.
I want to train my model using this transformed dataset.
So I built this model:
from keras.layers import Input, Embedding, LSTM, Dense
from keras.models import Model
from keras.utils import to_categorical

vocab_size = len(glove_wordmap) + 1

X = dataset_processed[['sentence_1_padded', 'sentence_2_padded']]
y = dataset_processed[['target']]

inputs = Input(shape=(2,))
embedding_layer = Embedding(vocab_size, 128)(inputs)
x = LSTM(64)(embedding_layer)
x = Dense(32, activation='relu')(x)
predictions = Dense(3, activation='softmax')(x)

model = Model(inputs=[inputs], outputs=predictions)
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['acc'])
model.fit(X, to_categorical(y), epochs=5, batch_size=32, validation_split=0.25)
and I keep getting the following error:
Train on 3750 samples, validate on 1250 samples
Epoch 1/5
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-248-ddab59c5a43a> in <module>()
----> 1 model.fit(X, to_categorical(y), epochs=5, batch_size=32, validation_split=0.25)
4 frames
/usr/local/lib/python3.6/dist-packages/keras/engine/training.py in fit(self, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, steps_per_epoch, validation_steps, validation_freq, max_queue_size, workers, use_multiprocessing, **kwargs)
1176 steps_per_epoch=steps_per_epoch,
1177 validation_steps=validation_steps,
-> 1178 validation_freq=validation_freq)
1179
1180 def evaluate(self,
/usr/local/lib/python3.6/dist-packages/keras/engine/training_arrays.py in fit_loop(model, fit_function, fit_inputs, out_labels, batch_size, epochs, verbose, callbacks, val_function, val_inputs, shuffle, callback_metrics, initial_epoch, steps_per_epoch, validation_steps, validation_freq)
202 ins_batch[i] = ins_batch[i].toarray()
203
--> 204 outs = fit_function(ins_batch)
205 outs = to_list(outs)
206 for l, o in zip(out_labels, outs):
/usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py in __call__(self, inputs)
2977 return self._legacy_call(inputs)
2978
-> 2979 return self._call(inputs)
2980 else:
2981 if py_any(is_tensor(x) for x in inputs):
/usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py in _call(self, inputs)
2915 array_vals.append(
2916 np.asarray(value,
-> 2917 dtype=tf.as_dtype(tensor.dtype).as_numpy_dtype))
2918 if self.feed_dict:
2919 for key in sorted(self.feed_dict.keys()):
/usr/local/lib/python3.6/dist-packages/numpy/core/_asarray.py in asarray(a, dtype, order)
83
84 """
---> 85 return array(a, dtype, copy=False, order=order)
86
87
ValueError: setting an array element with a sequence.
I have read this post, but it doesn't help, since I've checked the dimensions and the types of the elements in X: each sequence in each column is a numpy array of shape (80, 50).
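The traceback ends in np.asarray, and I can reproduce the same error with just numpy and pandas on toy data shaped like mine:

import numpy as np
import pandas as pd

# Two columns where every cell is an (80, 50) numpy array, like my X
df = pd.DataFrame({
    'sentence_1_padded': [np.zeros((80, 50)) for _ in range(4)],
    'sentence_2_padded': [np.zeros((80, 50)) for _ in range(4)],
})

# Keras eventually calls np.asarray on the batch, which fails here too:
np.asarray(df.values, dtype=np.float32)
# ValueError: setting an array element with a sequence.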
I would be really grateful to anyone who has a solution to my problem, or who knows a beginner-friendly tutorial for solving this kind of problem.
Thank you for your help!
PS: Feel free to tell me if I'm off to a bad start with this problem.
Upvotes: 0
Views: 627
Reputation: 11333
You're trying to solve a sentence entailment problem. This means that you need two streams of network flow in your graph (i.e. one for each sentence). The main problem is that you have defined an Input layer of shape (None, 2), but your input has a sequence length of 80 (it probably has the shape (None, 80, 50, 2)).
Another problem is that your sentence_1_padded and sentence_2_padded need to be (5000, 80), not (5000, 80, 50), because your Embedding layer expects word IDs, not the GloVe embeddings themselves. If you want the GloVe embeddings, you need to initialize your Embedding layer with the GloVe vectors.
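For example, initializing the Embedding layer with the pretrained vectors would look something like this (a sketch; it assumes glove_wordmap maps each word to its 50-dim GloVe vector and that word IDs follow its iteration order):

import numpy as np
from keras.layers import Embedding

embedding_dim = 50
vocab_size = len(glove_wordmap) + 1  # ID 0 is reserved for padding

# Row i holds the GloVe vector of the word assigned ID i
embedding_matrix = np.zeros((vocab_size, embedding_dim))
for word_id, word in enumerate(glove_wordmap, start=1):
    embedding_matrix[word_id] = glove_wordmap[word]

embedding_layer = Embedding(vocab_size, embedding_dim,
                            weights=[embedding_matrix],
                            trainable=False)  # set True to fine-tune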
Therefore, you need to make the following changes:

- Change the Input layer to shape (80,).
- Instead of having sentence_1_padded and sentence_2_padded as elements of shape 80x50 (i.e. after performing the embedding lookup), you need to have them as word IDs of sequence length 80. That is, if you get data['sentence_1_padded'] for example, it should return a (5000, 80) matrix (a conversion sketch follows the code below).

import numpy as np
from keras.layers import Input, Embedding, LSTM, Dense, Concatenate
from keras.models import Model
from keras.utils import to_categorical

# Toy data (dataset size 250)
# X1 = np.random.randint(0,100,size=(250,80))
# X2 = np.random.randint(0,100,size=(250,80))
# y = np.random.choice([0,1,2], size=(250,))
# Each padded column must yield a (num_samples, 80) integer matrix of word IDs
X = [dataset_processed['sentence_1_padded'], dataset_processed['sentence_2_padded']]
inputs1 = Input(shape=(80,))
inputs2 = Input(shape=(80,))
embedding_layer = Embedding(vocab_size, 128)
emb_out1 = embedding_layer(inputs1)
emb_out2 = embedding_layer(inputs2)
lstm_layer = LSTM(64)
x1 = lstm_layer(emb_out1)
x2 = lstm_layer(emb_out2)
dense = Dense(32, activation='relu')
x1 = dense(x1)
x2 = dense(x2)
x = Concatenate(axis=-1)([x1,x2])
predictions = Dense(3, activation='softmax')(x)
model = Model(inputs=[inputs1, inputs2], outputs=predictions)
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['acc'])
model.fit(X, to_categorical(y), epochs=5, batch_size=32, validation_split=0.25)
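For the word-ID conversion mentioned in the list above, here is a sketch using Keras's Tokenizer and pad_sequences (it assumes dataset is the original DataFrame with the raw sentence_1 and sentence_2 columns; the names are illustrative):

import numpy as np
from keras.preprocessing.text import Tokenizer
from keras.preprocessing.sequence import pad_sequences

# One tokenizer over all sentences so both columns share the same word IDs
tokenizer = Tokenizer()
tokenizer.fit_on_texts(list(dataset['sentence_1']) + list(dataset['sentence_2']))

# Each column becomes a (5000, 80) integer matrix; ID 0 is the padding
X1 = pad_sequences(tokenizer.texts_to_sequences(dataset['sentence_1']), maxlen=80)
X2 = pad_sequences(tokenizer.texts_to_sequences(dataset['sentence_2']), maxlen=80)

# Use this as vocab_size when building the Embedding layer above
vocab_size = len(tokenizer.word_index) + 1

model.fit([X1, X2], to_categorical(dataset['target']),
          epochs=5, batch_size=32, validation_split=0.25)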
Upvotes: 1