Testing the Keras sentiment classification with model.predict

Question

I have trained the imdb_lstm.py on my PC. Now I want to test the trained network by inputting some text of my own. How do I do it? Thank you!

Lior Magen · Accepted Answer

So what you basically need to do is as follows:

Tokenize sequnces: convert the string into words (features): For example: "hello my name is georgio" to ["hello", "my", "name", "is", "georgio"].
Next, you want to remove stop words (check Google for what stop words are).
This stage is optional, it may lead to faulty results but I think it worth a try. Stem your words (features), that way you'll reduce the number of features which will lead to a faster run. Again, that's optional and might lead to some failures, for example: if you stem the word 'parking' you get 'park' which has a different meaning.
Next thing is to create a dictionary (check Google for that). Each word gets a unique number and from this point we will use this number only.
Computers understand numbers only so we need to talk in their language. We'll take the dictionary from stage 4 and replace each word in our corpus with its matching number.
Now we need to split our data set to two groups: training and testing sets. One (training) will train our NN model and the second (testing) will help us to figure out how good is our NN. You can use Keras' cross validation function.
Next thing is defining whats the max number of features our NN can get as an input. Keras call this parameter - 'maxlen'. But you don't really have to do this manually, Keras can do that automatically just by searching for the longest sentence you have in your corpus.
Next, let's say that Keras found out that the longest sentence in your corpus has 20 words (features) and one of your sentences is the example in the first stage, which its length is 5 (if we'll remove stop words it'll be shorter), in such case we'll need to add zeros, 15 zeros actually. This is called pad sequence, we do that so every input sequence will be in the same length.

Testing the Keras sentiment classification with model.predict

Answers (2)

Related Questions