Reputation: 3
I am new to ML, and I would like to use Keras to label every number in a sequence as 1 or 0, depending on whether it is greater than the previous number. That is, if I had:
sequence a = [1, 2, 6, 4, 5],
The solution should be: sequence b = [0, 1, 1, 0, 1].
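(For reference, this is how I would compute the labels directly in plain Python, with the first element fixed to 0 as in my example:)
a = [1, 2, 6, 4, 5]
b = [0] + [int(curr > prev) for prev, curr in zip(a, a[1:])]
print(b)  # [0, 1, 1, 0, 1]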
So far, I have written:
import numpy as np
import tensorflow as tf
from tensorflow import keras
from keras.models import Sequential
from keras.layers import Dense
model = tf.keras.Sequential([tf.keras.layers.Dense(units=1, input_shape=[1,1])])
model.add(tf.keras.layers.Dense(17))
model.add(tf.keras.layers.Dense(17))
model.compile(optimizer='sgd', loss='BinaryCrossentropy', metrics=['binary_accuracy'])
b = [1,6,8,3,5,8,90,5,432,3,5,6,8,8,4,234,0]
a = [0,1,1,0,1,1,1,0,1,0,1,1,1,0,0,1,0]
b = np.array(b, dtype=float)
a = np.array(a, dtype=float)
model.fit(b, a, epochs=500, batch_size=1)
# Generate predictions for samples
predictions = model.predict(b)
print(predictions)
When I do this, I end up with:
Epoch 500/500
17/17 [==============================] - 0s 499us/step - loss: 7.9229 - binary_accuracy: 0.4844
[[[-1.37064695e+01  4.70858345e+01 -4.67341652e+01 -1.94298875e+00
    5.75960045e+01  6.70146179e+01  6.34545479e+01 -4.86319550e+02
    2.26250134e+01 -8.60109329e+00 -4.03220863e+01 -1.67574768e+01
    3.36148148e+01 -4.55171967e+00 -1.39924898e+01  6.31023712e+01
   -9.14120102e+00]]
 ...
 (16 more 17-dimensional vectors like the one above, one per input, instead of single 0/1 predictions)]
Upvotes: 0
Views: 475
Reputation: 19322
There are a few issues with how you are approaching this -
Your setup for the deep learning problem is flawed. You want to use information about the previous element to infer the label for the current element, but during training (and inference) you only pass in the current element. Imagine deploying this model tomorrow: the only information I give it is, say, "15", and I ask whether it is bigger than the previous element, which the model never sees. How should it respond?
Secondly, why is your output layer predicting a 17-dimensional vector? Shouldn't the goal be to predict a 0 or 1 (a probability)? In that case your output should be a single unit with a sigmoid activation. Also note that your layers have no activations at all; without them, stacked Dense layers collapse into a single linear map. Refer to the derivation below as a guide for your future network setups.
# 2-layer neural network without activation
h = W1.X + B1
o = W2.h + B2
  = W2.(W1.X + B1) + B2
  = W2.W1.X + (W2.B1 + B2)
  = W3.X + B3   # same as linear regression!

# 2-layer neural network with activations
h = activation(W1.X + B1)
o = activation(W2.h + B2)
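A quick NumPy check of the collapse above, with shapes chosen arbitrarily just for illustration:
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=4)                          # input vector
W1, B1 = rng.normal(size=(3, 4)), rng.normal(size=3)
W2, B2 = rng.normal(size=(2, 3)), rng.normal(size=2)

o = W2 @ (W1 @ X + B1) + B2                     # two linear layers, no activation
W3, B3 = W2 @ W1, W2 @ B1 + B2                  # the equivalent single linear layer
print(np.allclose(o, W3 @ X + B3))              # True: it collapsed to W3.X + B3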
I would advise starting from the basics of neural networks to build up good practices first, before jumping into framing your own problem statements. The Keras author François Chollet (fchollet) has some excellent starter notebooks that you can explore.
For your case, try these modifications -
import numpy as np
import tensorflow as tf

# Modify input shape and output shape + add activations
model = tf.keras.Sequential([tf.keras.layers.Dense(units=1, input_shape=(2,))]) #<------
model.add(tf.keras.layers.Dense(17, activation='relu')) #<------
model.add(tf.keras.layers.Dense(1, activation='sigmoid')) #<------
model.compile(optimizer='sgd', loss='binary_crossentropy', metrics=['binary_accuracy'])

# Create 2 features per sample: the 1st is the previous element, the 2nd is the current element
b = [1,6,8,3,5,8,90,5,432,3,5,6,8,8,4,234,0]
b = np.array(list(zip(b, b[1:]))) #<---- shape (16, 2)

# Start the labels from the first pair of elements
a = np.array([0,1,1,0,1,1,1,0,1,0,1,1,1,0,0,1,0])[1:] #<---- shape (16,)

model.fit(b, a, epochs=20, batch_size=1)

# Generate predictions for samples
predictions = model.predict(b)
print(np.round(predictions))
Epoch 1/20
16/16 [==============================] - 0s 1ms/step - loss: 3.0769 - binary_accuracy: 0.7086
Epoch 2/20
16/16 [==============================] - 0s 823us/step - loss: 252.6490 - binary_accuracy: 0.6153
Epoch 3/20
16/16 [==============================] - 0s 1ms/step - loss: 3.8109 - binary_accuracy: 0.9212
Epoch 4/20
16/16 [==============================] - 0s 787us/step - loss: 0.0131 - binary_accuracy: 0.9845
Epoch 5/20
16/16 [==============================] - 0s 2ms/step - loss: 0.0767 - binary_accuracy: 1.0000
Epoch 6/20
16/16 [==============================] - 0s 1ms/step - loss: 0.0143 - binary_accuracy: 0.9800
Epoch 7/20
16/16 [==============================] - 0s 2ms/step - loss: 0.0111 - binary_accuracy: 1.0000
Epoch 8/20
16/16 [==============================] - 0s 2ms/step - loss: 4.0658e-04 - binary_accuracy: 1.0000
Epoch 9/20
16/16 [==============================] - 0s 941us/step - loss: 6.3996e-04 - binary_accuracy: 1.0000
Epoch 10/20
16/16 [==============================] - 0s 1ms/step - loss: 1.1477e-04 - binary_accuracy: 1.0000
Epoch 11/20
16/16 [==============================] - 0s 837us/step - loss: 6.8807e-04 - binary_accuracy: 1.0000
Epoch 12/20
16/16 [==============================] - 0s 2ms/step - loss: 5.0521e-04 - binary_accuracy: 1.0000
Epoch 13/20
16/16 [==============================] - 0s 851us/step - loss: 0.0015 - binary_accuracy: 1.0000
Epoch 14/20
16/16 [==============================] - 0s 1ms/step - loss: 0.0012 - binary_accuracy: 1.0000
Epoch 15/20
16/16 [==============================] - 0s 765us/step - loss: 0.0014 - binary_accuracy: 1.0000
Epoch 16/20
16/16 [==============================] - 0s 906us/step - loss: 3.9230e-04 - binary_accuracy: 1.0000
Epoch 17/20
16/16 [==============================] - 0s 1ms/step - loss: 0.0022 - binary_accuracy: 1.0000
Epoch 18/20
16/16 [==============================] - 0s 1ms/step - loss: 2.2149e-04 - binary_accuracy: 1.0000
Epoch 19/20
16/16 [==============================] - 0s 2ms/step - loss: 1.7345e-04 - binary_accuracy: 1.0000
Epoch 20/20
16/16 [==============================] - 0s 1ms/step - loss: 7.7950e-05 - binary_accuracy: 1.0000
[[1.]
[1.]
[0.]
[1.]
[1.]
[1.]
[0.]
[1.]
[0.]
[1.]
[1.]
[1.]
[0.]
[0.]
[1.]
[0.]]
The above model is easy to train since the problem is not complex; you can see that the accuracy reaches 100% very quickly. Let's try making predictions on unseen data with this new model -
np.round(model.predict(np.array([[5, 1],      #<- Is 1 > 5?
                                 [5, 500],    #<- Is 500 > 5?
                                 [5, 6]])))   #<- Is 6 > 5?
array([[0.], #<- No
[1.], #<- Yes
[1.]], dtype=float32) #<- Yes
Upvotes: 2
Reputation: 941
The problem is that your output layer has 17 neurons, which does not make sense here. For a binary choice like this, you want 1 (or 2) neurons at the output.
Change the last layer to:
model.add(tf.keras.layers.Dense(1, activation='sigmoid'))
You will then get one output prediction per input. Since these are probabilities rather than hard 0 and 1 values, you will have to round them, e.g. with np.round.
The sigmoid activation is used to produce probabilities between 0 and 1, and a single output neuron is used because your output is a binary choice that can be encoded in one value.
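A minimal sketch of that post-processing (assuming model has been rebuilt with the sigmoid output layer and retrained on your b and a arrays):
import numpy as np

probs = model.predict(b)              # probabilities in (0, 1), one per input
labels = np.round(probs).astype(int)  # threshold at 0.5 to get hard 0/1 labels
print(labels)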
However, this only fixes the issues in your code. I would argue that a dense neural network is NOT the right choice for your problem and will probably have a hard time learning anything useful.
Upvotes: 0