It's a simple model architecture based on this tutorial. The dataset would look like this, although in 10 dimensions:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras import layers, optimizers
from sklearn.datasets import make_blobs


def pre_processing(inputs, targets):
    inputs = tf.cast(inputs, tf.float32)
    targets = tf.cast(targets, tf.int64)
    return inputs, targets


def get_data():
    inputs, targets = make_blobs(n_samples=1000, n_features=10, centers=7, cluster_std=1)
    data = tf.data.Dataset.from_tensor_slices((inputs, targets))
    data = data.map(pre_processing)
    data = data.take(count=1000).shuffle(buffer_size=1000).batch(batch_size=256)
    return data


model = Sequential([
    layers.Dense(8, input_shape=(10,), activation='relu'),
    layers.Dense(16, activation='relu'),
    layers.Dense(32, activation='relu'),
    layers.Dense(7)])


@tf.function
def compute_loss(logits, labels):
    return tf.reduce_mean(
        tf.nn.sparse_softmax_cross_entropy_with_logits(
            logits=logits, labels=labels))


@tf.function
def compute_accuracy(logits, labels):
    predictions = tf.argmax(logits, axis=1)
    return tf.reduce_mean(tf.cast(tf.equal(predictions, labels), tf.float32))


@tf.function
def train_step(model, optim, x, y):
    with tf.GradientTape() as tape:
        logits = model(x)
        loss = compute_loss(logits, y)
    grads = tape.gradient(loss, model.trainable_variables)
    optim.apply_gradients(zip(grads, model.trainable_variables))
    accuracy = compute_accuracy(logits, y)
    return loss, accuracy


def train(epochs, model, optim):
    train_ds = get_data()
    loss = 0.
    acc = 0.
    for step, (x, y) in enumerate(train_ds):
        loss, acc = train_step(model, optim, x, y)
        if step % 500 == 0:
            print(f'Epoch {epochs} loss {loss.numpy()} acc {acc.numpy()}')
    return loss, acc


optim = optimizers.Adam(learning_rate=1e-6)

for epoch in range(100):
    loss, accuracy = train(epoch, model, optim)
Epoch 85 loss 2.530677080154419 acc 0.140625
Epoch 86 loss 3.3184046745300293 acc 0.0
Epoch 87 loss 3.138179063796997 acc 0.30078125
Epoch 88 loss 3.7781732082366943 acc 0.0
Epoch 89 loss 3.4101686477661133 acc 0.14453125
Epoch 90 loss 2.2888522148132324 acc 0.13671875
Epoch 91 loss 5.993691444396973 acc 0.16015625
What have I done wrong?
There are two problems in your code:
The first one is that you are generating a new training dataset in each epoch (see the first line of the train function: get_data is called on every epoch). Since you are using the sklearn.datasets.make_blobs function to generate the data clusters, there is no guarantee that the clusters generated by different calls follow the same distribution and/or label mapping. Therefore, the best the model can do in each epoch, on what is effectively a completely different dataset, is a random guess (hence the average 1/7 ≈ 0.14 accuracy you see in the results). To resolve this, take the data generation out of the train function, i.e. generate the data once at the global level by calling get_data, and pass the generated dataset to the train function as an argument, as in the sketch below.
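A minimal sketch of that restructuring, reusing your existing get_data and train_step functions unchanged:

# Build the dataset once, at the global level, so every epoch iterates
# over the same clusters with the same label mapping.
train_ds = get_data()

def train(epoch, model, optim, train_ds):
    loss, acc = 0., 0.
    for step, (x, y) in enumerate(train_ds):
        loss, acc = train_step(model, optim, x, y)
    print(f'Epoch {epoch} loss {loss.numpy()} acc {acc.numpy()}')
    return loss, acc

for epoch in range(100):
    loss, accuracy = train(epoch, model, optim, train_ds)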
The second problem is that you are using a very low learning rate (1e-6) for the optimizer, so the model is stuck and effectively not training at all. Instead, use the default learning rate of the Adam optimizer, i.e. 1e-3, and change it only as needed (e.g. based on the results of the experiments you perform).
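Concretely (1e-3 is also what Keras uses when you construct Adam without arguments):

optim = optimizers.Adam(learning_rate=1e-3)  # Adam's default learning rate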