Reputation: 94
I have trained a sentiment analysis model on the IMDB movie review dataset. Now I want to test it on a custom input, i.e. an arbitrary string such as "Hello". However, I loaded the train and test sets from the 'imdb.pkl' file, which returns already preprocessed text as a tuple of lists of lists of integers. From what I have read, each word is assigned an integer. So my question is: how do I convert (encode) my custom input string to that format so that I can pass it to model.predict(custom_input)?
train, test, _ = imdb.load_data(path='imdb.pkl', n_words=10000)
train
([[17, 25, 10, 406, 26, 14, 56, 61, 62, 323, 4],
[16, 586, 32, 885, 17, 39, 68, 31, 2994, 2389, 328, 4],
[1, 2, 1, 139, 6, 130, 1, 5, 6, 25, 105, 4730, 40],
[30, 287, 142, 2216, 707, 3763, 20, 68, 57, 30, 37, 309, 14, 4],
[224, 3, 371, 3, 1, 4, 128, 37, 16, 90, 48, 1247, 8, 79, 294, 913, 1709, 4],
[17,
10,
2,....]])
type(train)
tuple
type(train[0])
list
type(train[0][0])
list
type(train[0][0][0])
int
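To make the target format concrete, here is a rough sketch of the kind of encoding I am after. It is only an illustration under assumptions: `word_index` is a made-up toy vocabulary (the real word-to-integer mapping would have to be the one used to build 'imdb.pkl', which is not part of the loaded tuples), `encode_text` is a hypothetical helper, and treating 1 as the id for unknown or out-of-vocabulary words is my guess based on the `n_words=10000` cap.

```python
import re

# Hypothetical toy vocabulary, for illustration only. The real mapping must be
# the same word-to-integer dictionary that was used to preprocess imdb.pkl.
word_index = {"the": 2, "movie": 3, "was": 4, "great": 5, "hello": 12000}


def encode_text(text, word_index, n_words=10000, unk_id=1):
    """Turn a raw string into a list of integer word ids."""
    # Simple lowercase tokenization; the original preprocessing may differ.
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    ids = []
    for tok in tokens:
        idx = word_index.get(tok, unk_id)
        # Assumption: ids at or above n_words collapse to unk_id, to mirror
        # the vocabulary cap applied when the dataset was loaded.
        ids.append(idx if idx < n_words else unk_id)
    return ids


encoded = encode_text("The movie was great, hello!", word_index)
print(encoded)  # [2, 3, 4, 5, 1]
```

Presumably the encoded list would then have to be padded the same way as the training sequences and wrapped in a list (a batch of one) before being passed to model.predict.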
Upvotes: 2
Views: 356