Reputation: 11824
Hello ML/AI newbie here,
I'm asking this question because I've no idea about machine learning, ai, e.t.c and I've no idea how to continue, what questions to ask. Even if I accidentally find the solution i wouldn't know.
Ok, I followed this tutorial about "Text Classification" and it went pretty well, no problems up to here. https://www.youtube.com/watch?v=6g4O5UOH304&list=WL&index=8&t=0s
It classifies IMDB comments and checks if a review is "Positive" or "Negative", "0" or "1"
My question is
Let say I've my own dataset, similar to IMDB but instead of "0" and "1" I have several categories as numbers like "1,2,3,4,5,6,7,8,9,10,11,12,...." for each string. So I need it to return one of these numbers (since it's learning let say two of them if it can't decide)
What should I do? A link to a tutorial related to what I need would be great too.
import tensorflow as tf
from tensorflow import keras
import numpy as np
data = keras.datasets.imdb
(train_data, train_labels), (test_data, test_labels) = data.load_data(num_words=3000)
word_index = data.get_word_index()
word_index = {k:(v+3) for k, v in word_index.items()}
word_index["<PAD>"] = 0;
word_index["<START>"] = 1;
word_index["<UNK>"] = 2;
word_index["<UNUSED>"] = 3;
reverse_word_index = dict([(value, key) for (key, value) in word_index.items()])
train_data = keras.preprocessing.sequence.pad_sequences(train_data, value=word_index["<PAD>"], padding="post", maxlen=250)
test_data = keras.preprocessing.sequence.pad_sequences(test_data, value=word_index["<PAD>"], padding="post", maxlen=250)
def decode_review(text):
return " ".join([reverse_word_index.get(i, "?") for i in text])
model = keras.Sequential()
model.add(keras.layers.Embedding(10000, 6))
model.add(keras.layers.GlobalAveragePooling1D())
model.add(keras.layers.Dense(16, activation="relu"))
model.add(keras.layers.Dense(1, activation="sigmoid"))
#model.summary()
model.compile(optimizer="adam", loss="binary_crossentropy", metrics="accuracy")
x_val = train_data[:10000]
x_train = train_data[10000:]
y_val = train_labels[:10000]
y_train = train_labels[10000:]
fitModel = model.fit(x_train, y_train, epochs=40, batch_size=512, validation_data=(x_val, y_val), verbose=1)
results = model.evaluate(test_data, test_labels)
print(results)
for index in range(20):
test_review = test_data[index]
predict = model.predict([test_review])
if predict[0] > 0.8:
print(decode_review(test_data[index]))
print(str(predict[0]))
print(str(test_labels[index]))
Upvotes: 1
Views: 323
Reputation: 22031
your task is a multiclass classification problem and for this reason, you have to modify your output layer. you have two possibilities.
if you have 1D integer encoded target you can use sparse_categorical_crossentropy as loss function, softmax as the last activation and the dimension of the last dense output equal to the number of class to predict
X = np.random.randint(0,10, (1000,100))
y = np.random.randint(0,3, 1000)
model = Sequential([
Dense(128, input_dim = 100),
Dense(3, activation='softmax'),
])
model.summary()
model.compile(loss='sparse_categorical_crossentropy',optimizer='adam',metrics=['accuracy'])
history = model.fit(X, y, epochs=3)
Otherwise, if you have one-hot encoded your target you can use categorical_crossentropy, softmax as the last activation and the dimension of the last dense output equal to the number of class to predict
X = np.random.randint(0,10, (1000,100))
y = pd.get_dummies(np.random.randint(0,3, 1000)).values
model = Sequential([
Dense(128, input_dim = 100),
Dense(3, activation='softmax'),
])
model.summary()
model.compile(loss='categorical_crossentropy',optimizer='adam',metrics=['accuracy'])
history = model.fit(X, y, epochs=3)
the usage of softmax enables to interpret the output as probability scores which sum to 1
when you compute the final prediction, to obtain the predicted class you can simply to in this way np.argmax(model.predict(X), axis=1)
these are some basic tutorials for multiclass text classification:
Upvotes: 1