Reputation: 11
I'm trying to do sentiment analysis with Keras on my own texts using the imdb_lstm.py example, but I don't know how to test it. I saved my model architecture and weights to files, and I load them like this:
from keras.models import model_from_json
model = model_from_json(open('my_model_architecture.json').read())
model.compile(loss='binary_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])
model.load_weights('my_model_weights.h5')
results = model.evaluate(X_test, y_test, batch_size=32)
but of course I don't know what X_test and y_test should look like. Could anyone please help me?
Upvotes: 1
Views: 2065
Reputation: 16587
First, split your dataset into train, validation, and test sets and do some preprocessing:
from tensorflow import keras
print('load data')
(x_train, y_train), (x_test, y_test) = keras.datasets.imdb.load_data(num_words=10000)
word_index = keras.datasets.imdb.get_word_index()
print('preprocessing...')
x_train = keras.preprocessing.sequence.pad_sequences(x_train, maxlen=256)
x_test = keras.preprocessing.sequence.pad_sequences(x_test, maxlen=256)
x_val = x_train[:10000]
y_val = y_train[:10000]
x_train = x_train[10000:]
y_train = y_train[10000:]
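To make the shape of x_test and y_test concrete, here is a minimal pure-Python sketch of what pad_sequences does with its default settings (pad and truncate at the front, pad value 0). The helper pad_left and the two toy reviews are made up for illustration and are not part of Keras; maxlen is assumed to be positive.

```python
def pad_left(sequence, maxlen, value=0):
    """Mimic the pad_sequences defaults: keep the last `maxlen` tokens
    and fill the front with `value` when the sequence is shorter."""
    trimmed = sequence[-maxlen:]
    return [value] * (maxlen - len(trimmed)) + trimmed

# Two hypothetical reviews, already encoded as integer word ids.
reviews = [[1, 14, 22, 16], [1, 194, 1153, 194, 8255, 78, 228]]
# One label per review: 1 = positive, 0 = negative.
labels = [1, 0]

# Every row now has the same length, which is what the model expects.
x = [pad_left(review, maxlen=6) for review in reviews]
for row, label in zip(x, labels):
    print(row, label)
```

So the test inputs form a 2-D integer array with one row of maxlen word ids per review, and the labels form a 1-D array of 0s and 1s of the same length.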
As you can see, we also load word_index, because we need it later to convert a new sentence into a sequence of integers.
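One detail to watch when using word_index: with its default arguments, load_data() reserves 0 for padding, 1 for the start of a review, and 2 for unknown words, and shifts every index from word_index by 3. Below is a rough sketch of an encoder that follows the same convention. The function encode_sentence and the tiny toy_word_index are hypothetical, and the sketch only splits on whitespace, so punctuation is not handled.

```python
# Index conventions used by keras.datasets.imdb.load_data() with default arguments.
START, OOV, INDEX_FROM = 1, 2, 3

def encode_sentence(sentence, word_index, num_words=10000):
    """Encode a raw sentence the way the IMDB training reviews are encoded."""
    tokens = [START]
    for word in sentence.lower().split():
        rank = word_index.get(word)
        # Shift known words by INDEX_FROM; unknown or too-rare words become OOV.
        if rank is not None and rank + INDEX_FROM < num_words:
            tokens.append(rank + INDEX_FROM)
        else:
            tokens.append(OOV)
    return tokens

# Hypothetical miniature word_index; the real one is much larger.
toy_word_index = {"the": 1, "and": 2, "movie": 17, "bad": 75}
encoded = encode_sentence("The movie was BAD", toy_word_index)
print(encoded)
```

Keeping num_words equal to the value passed to load_data() matters, because any id at or above the Embedding layer's input size would be out of range.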
Second, define your model:
print('build model')
model = keras.Sequential()
model.add(keras.layers.Embedding(10000, 16))
model.add(keras.layers.LSTM(100))
model.add(keras.layers.Dense(16, activation='relu'))
model.add(keras.layers.Dense(1, activation='sigmoid'))
model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])
print('train model')
model.fit(x_train,
          y_train,
          epochs=5,
          batch_size=512,
          validation_data=(x_val, y_val),
          verbose=1)
Finally, save and load your model with:
print('save trained model...')
model.save('sentiment_keras.h5')
del model
print('load model...')
# use the same tensorflow.keras namespace as above instead of the standalone keras package
model = keras.models.load_model('sentiment_keras.h5')
You can evaluate your model on the test set:
print('evaluation')
evaluation = model.evaluate(x_test, y_test, batch_size=512)
print('Loss:', evaluation[0], 'Accuracy:', evaluation[1])
If you want to test the model on a completely new sentence, you can do:
import numpy as np
sample = 'this is new sentence and this very bad bad sentence'
sample_label = 0
# convert the input sentence to tokens based on word_index; load_data() reserves
# 1 for the start token and 2 for unknown words, and shifts every word index by 3
inps = [1] + [word_index[w] + 3 if word_index.get(w, 10000) + 3 < 10000 else 2
              for w in sample.lower().split()]
# the sequence length should be the same as for the training inputs
inps = keras.preprocessing.sequence.pad_sequences([inps], maxlen=256)
print('Accuracy:', model.evaluate(inps, np.array([sample_label]), batch_size=1)[1])
print('Sentiment score: {}'.format(model.predict(inps)[0][0]))
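The sentiment score is the output of the final sigmoid unit, so it lies between 0 and 1. A common way to turn it into a class is to threshold it at 0.5. The helper below is a small sketch of that step; the name to_label and the example scores are made up, and 0.5 is a conventional choice rather than something the model requires.

```python
def to_label(score, threshold=0.5):
    """Map a sigmoid output to an IMDB class (1 = positive, 0 = negative)."""
    return "positive" if score >= threshold else "negative"

# Hypothetical scores, standing in for values of model.predict(inps)[0][0].
for score in (0.08, 0.5, 0.93):
    print(score, "->", to_label(score))
```

Accuracy on a single sentence can only be 0 or 1, so the raw score is usually more informative when checking individual examples.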
Upvotes: 1