Reputation: 111
The accuracy starts off at around 40% and drops to 25% over the course of one epoch
My model:
self._model = keras.Sequential()
self._model.add(keras.layers.Dense(12, activation=tf.nn.sigmoid)) # hidden layer
self._model.add(keras.layers.Dense(len(VCDNN.conventions), activation=tf.nn.softmax)) # output layer
optimizer = tf.train.AdamOptimizer(0.01)
self._model.compile(optimizer, loss=tf.losses.sparse_softmax_cross_entropy, metrics=["accuracy"])
I have 4 labels and 60k rows of data, split evenly across the labels (15k each), plus 20k rows of data for evaluation
My data example:
name        label
abcTest     label1
mete_Test   label2
ROMOBO      label3
test        label4
The input is turned into integers (one per character) and then one-hot encoded; the output is just turned into integers [0-3]
1-epoch evaluation (loss, acc):
[0.7436684370040894, 0.25]
UPDATE: More details about the data
The strings are up to 20 characters long. I first convert each character to an int based on an alphabet dictionary (a: 1, b: 2, c: 3, ...), and if a word is shorter than 20 chars I pad the rest with 0's. Those values are then one-hot encoded and reshaped. So, assuming a max of 5 characters:
1. ["abc","d"]
2. [[1,2,3,0,0],[4,0,0,0,0]]
3. [[[0,1,0,0,0],[0,0,1,0,0],[0,0,0,1,0],[1,0,0,0,0],[1,0,0,0,0]],[[0,0,0,0,1],[1,0,0,0,0],[1,0,0,0,0],[1,0,0,0,0],[1,0,0,0,0]]]
4. [[0,1,0,0,0,0,0,1,0,0,0,0,0,1,0,1,0,0,0,0,1,0,0,0,0],[0,0,0,0,1,1,0,0,0,0,1,0,0,0,0,1,0,0,0,0,1,0,0,0,0]]
The labels describe the way a word is spelled, basically its naming convention, e.g. all lowercase - unicase, testBest - camelCase, TestTest - PascalCase, test_test - snake_case.
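In code, the encoding looks roughly like this (a simplified sketch matching the 5-character example above; the real code uses up to 20 characters and the full alphabet):

import numpy as np

# index 0 = padding, a=1 ... d=4, so the one-hot width is 5
ALPHABET = {"a": 1, "b": 2, "c": 3, "d": 4}
MAX_LEN = 5
NUM_SYMBOLS = 5

def encode(word):
    # step 2: char -> int, zero-padded up to MAX_LEN
    ids = [ALPHABET[ch] for ch in word][:MAX_LEN]
    ids += [0] * (MAX_LEN - len(ids))
    # step 3: one-hot encode each int
    one_hot = np.eye(NUM_SYMBOLS, dtype=int)[ids]
    # step 4: flatten to a single vector of length MAX_LEN * NUM_SYMBOLS
    return one_hot.reshape(-1)

print(encode("abc").tolist())  # first row of step 4
print(encode("d").tolist())    # second row of step 4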
With 2 extra layers added and the LR reduced to 0.001: (pic of training)
UPDATE 2
self._model = keras.Sequential()
self._model.add(keras.layers.Embedding(VCDNN.alphabetLen, 12, input_length=VCDNN.maxFeatureLen * VCDNN.alphabetLen))
self._model.add(keras.layers.LSTM(12))
self._model.add(keras.layers.Dense(len(VCDNN.conventions), activation=tf.nn.softmax))  # output layer
self._model.compile(tf.train.AdamOptimizer(self._LR), loss="sparse_categorical_crossentropy", metrics=self._metrics)
It seems to start and then immediately dies with no error message, just exit code -1073740791 (0xC0000409, STATUS_STACK_BUFFER_OVERRUN on Windows)
Upvotes: 1
Views: 1633
Reputation: 184
The 0.25 accuracy means the model couldn't learn anything useful: with 4 balanced classes, random guessing already gives 1/4 = 0.25. This suggests the network structure may not be a good fit for the problem.
Currently, recurrent neural networks, like the LSTM, are more commonly used for sequence modeling. For instance:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dense

model = Sequential()
model.add(Embedding(char_size, embedding_size))  # char_size: size of the character vocabulary
model.add(LSTM(hidden_size))                     # hidden_size: number of LSTM units
model.add(Dense(len(VCDNN.conventions), activation='softmax'))
This will work better if the label is related to the character-sequence information of the input words.
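For completeness, here is a minimal end-to-end sketch (the vocabulary size, layer sizes, and optimizer settings are illustrative assumptions, not fixed requirements):

import numpy as np
from tensorflow import keras

model = keras.Sequential([
    keras.layers.Embedding(27, 12),  # assumption: 26 letters + padding index 0
    keras.layers.LSTM(12),
    keras.layers.Dense(4, activation='softmax'),  # 4 naming conventions
])
model.compile('adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

# The Embedding layer consumes the padded integer sequences (step 2 in the
# question) directly, not the flattened one-hot vectors:
x = np.array([[1, 2, 3, 0, 0], [4, 0, 0, 0, 0]])  # "abc", "d" padded to length 5
y = np.array([0, 3])                              # integer labels in [0-3]
model.fit(x, y, epochs=1)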
Upvotes: 1
Reputation: 2963
This means your model isn't really learning anything useful; it might be stuck in a local minimum. This could be due to a number of reasons.
If you want to have a go at improving your model, I would add a few extra dense layers with a few more units. So after line 2 of your model I'd add:
self._model.add(keras.layers.Dense(36, activation=tf.nn.sigmoid))
self._model.add(keras.layers.Dense(36, activation=tf.nn.sigmoid))
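Put together with the rest of your model, that would look like this (a sketch keeping your names):

self._model = keras.Sequential()
self._model.add(keras.layers.Dense(12, activation=tf.nn.sigmoid))  # hidden layer
self._model.add(keras.layers.Dense(36, activation=tf.nn.sigmoid))  # extra layer
self._model.add(keras.layers.Dense(36, activation=tf.nn.sigmoid))  # extra layer
self._model.add(keras.layers.Dense(len(VCDNN.conventions), activation=tf.nn.softmax))  # output layer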
Another thing you can try is a different learning rate. I'd go with the default for AdamOptimizer, which is 0.001. So just change 0.01 to 0.001 in the AdamOptimizer() call.
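That is:

optimizer = tf.train.AdamOptimizer(0.001)  # Adam's default learning rate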
You may also want to train for more than just one epoch.
Upvotes: 0