Reputation: 21
Is it possible to use an LSTM together with an array of words that I've classified?
For example, I have an array with 1000 words:
'Green' 'Blue' 'Red' 'Yellow'
I classify the words as Green = 0, Blue = 1, Red = 2, Yellow = 3.
And I want to predict the 4th word. The words can come in different orders in the sequence. For example, the first sequence can be input = green, blue, red, target = yellow; the next sequence is input = blue, red, yellow, target = green; and so on.
Maybe I shouldn't use an LSTM for this, but I guess I should, since I want to look at the three earlier inputs and predict the 4th.
This is what I have so far. I'm more or less stuck with the reshape of my words list, and I can't really understand what input_shape I should have. I guess it's timesteps = 3 and features = 4.
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

# define documents
words = [0,1,2,3,2,3,1,0,0,1,2,3,2,0,3,1,1,2,3,0]
words_cat = to_categorical(words, 4)
X_train = ?
y_train = ?
# define the model
model = Sequential()
model.add(LSTM(32, input_shape=(3,4)))
model.add(Dense(4, activation='softmax'))
# compile the model
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
# summarize the model
model.summary()
# fit the model
model.fit(X_train, y_train, epochs=50, verbose=0)
Upvotes: 2
Views: 750
Reputation: 2513
As already mentioned in the first comment, an LSTM network is maybe a little bit of overkill in this case. But I assume you are doing this for pedagogical reasons.
Here's a working example:
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

# define documents
words = [0,1,2,3,2,3,1,0,0,1,2,3,2,0,3,1,1,2,3,0]
# create labels: for each 3-word window, the label is the word that follows it
# (np.roll wraps around, so the last three labels come from the start of the list)
labels = np.roll(words[:-3], -3)
# each sample is a window of 3 consecutive words, shaped (samples, 1, 3)
X_train = np.array([words[i:i+3] for i in range(len(words)-3)]).reshape(-1,1,3)
y_train = labels
# define the model
model = Sequential()
model.add(LSTM(32, input_shape=(None,3)))
model.add(Dense(4, activation='softmax'))
# compile the model
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
# summarize the model
model.summary()
# fit the model
model.fit(X_train, y_train, epochs=5, batch_size=1, verbose=1)
preds = model.predict(X_train).argmax(1)
print(preds)
print(y_train)
Output:
Epoch 1/5
17/17 [==============================] - 2s 88ms/step - loss: 1.3771 - accuracy: 0.1765
Epoch 2/5
17/17 [==============================] - 0s 9ms/step - loss: 1.3647 - accuracy: 0.3529
Epoch 3/5
17/17 [==============================] - 0s 6ms/step - loss: 1.3568 - accuracy: 0.2353
Epoch 4/5
17/17 [==============================] - 0s 8ms/step - loss: 1.3496 - accuracy: 0.2353
Epoch 5/5
17/17 [==============================] - 0s 7ms/step - loss: 1.3420 - accuracy: 0.4118
[1 2 1 2 0 0 0 1 1 2 1 0 2 1 1 1 2]
[3 2 3 1 0 0 1 2 3 2 0 3 1 1 0 1 2]
So I took the words you provided and reshaped them: each training sample consists of three consecutive words, and the fourth word is the label.
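To make the windowing concrete, here is a quick sanity check (using the X_train and y_train built above) that prints the first two samples next to their labels:
print(X_train[:2])  # [[[0 1 2]], [[1 2 3]]] -- shape (2, 1, 3)
print(y_train[:2])  # [3 2] -- the word that follows each window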
If your sequence is random, the model will have a hard time predicting the next value. Otherwise you might want to train longer or provide more examples (however, the number of possible combinations in this case is fairly limited).
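If you want to stay closer to the shapes you guessed in the question (timesteps = 3, features = 4), a one-hot variant works too. This is just a sketch under those assumptions (the _oh names are mine), with each window's label simply being the word that follows it:
from tensorflow.keras.utils import to_categorical
# one-hot encode each word: shape (len(words), 4)
words_cat = to_categorical(words, 4)
# each sample is 3 consecutive one-hot words: shape (samples, 3, 4)
X_train_oh = np.array([words_cat[i:i+3] for i in range(len(words)-3)])
y_train_oh = np.array(words[3:])  # integer labels, matching the sparse loss
model_oh = Sequential()
model_oh.add(LSTM(32, input_shape=(3,4)))
model_oh.add(Dense(4, activation='softmax'))
model_oh.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model_oh.fit(X_train_oh, y_train_oh, epochs=5, batch_size=1, verbose=1)
The only real difference is the encoding: the integer version treats the three words as features of a single timestep, while this one feeds one one-hot word per timestep.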
Upvotes: 1