Ahmed Dhanani

Reputation: 861

Keras simple neural network for NOT logic generating wrong output

I am new to deep learning and I am trying to implement NOT logic in Keras, but the results are not correct. Below is the code.

from keras.layers import Input, Dense
from keras.models import Model
import numpy as np

inputs = Input(shape=(1,))
x = Dense(1024, activation='relu')(inputs)
x = Dense(2048, activation='relu')(x)
predictions = Dense(1, activation='softmax')(x)

model = Model(input=inputs, output=predictions)
model.compile(optimizer='adam', loss='mean_squared_error', metrics=['accuracy'])

X = np.array([[0.], [1.]], dtype=np.float32)
y = np.array([[1.], [0.]], dtype=np.float32)
print "learning....."
model.fit(X, y, nb_epoch=100)
print model.predict(X)

Output:
The output is the same on every epoch:

Epoch 100/100
2/2 [==============================] - 0s - loss: 0.5000 - acc: 0.5000

and the predictions are:

[[ 1.]
 [ 1.]]

I am not sure what is wrong with this network.

Upvotes: 0

Views: 118

Answers (1)

sascha

Reputation: 33522

Your usage of the final activation looks wrong. Softmax is meant for multi-class predictions, and applied to a single unit it always outputs 1 (exp(z) / exp(z) = 1 for any z), which is exactly the constant prediction you observe. If you want to keep softmax, make the output layer Dense(2) and give it a one-hot multi-class target of dim=2.

from keras.layers import Input, Dense
from keras.models import Model
import numpy as np

inputs = Input(shape=(1,))
x = Dense(1024, activation='relu')(inputs)
x = Dense(2048, activation='relu')(x)
predictions = Dense(2, activation='softmax')(x)

model = Model(input=inputs, output=predictions)
model.compile(optimizer='adam', loss='mean_squared_error', metrics=['accuracy'])

X = np.array([[0.], [1.]], dtype=np.float32)
y = np.array([[1., 0], [0., 1]], dtype=np.float32)
print "learning....."
model.fit(X, y, nb_epoch=100)
print model.predict(X)

Output

Epoch 100/100
2/2 [==============================] - 0s - loss: 1.5137e-07 - acc: 1.0000
[[  9.99880254e-01   1.19736877e-04]
 [  5.35955711e-04   9.99464035e-01]]

Edit: One can argue whether the above combination of final-layer activation and loss function is a good one for binary classification (probably not). Link

Alternative using sigmoid and only one target:

from keras.layers import Input, Dense
from keras.models import Model
import numpy as np

inputs = Input(shape=(1,))
x = Dense(1024, activation='relu')(inputs)
x = Dense(2048, activation='relu')(x)
predictions = Dense(1, activation='sigmoid')(x)

model = Model(input=inputs, output=predictions)
model.compile(optimizer='adam', loss='mean_squared_error', metrics=['accuracy'])

X = np.array([[0.], [1.]], dtype=np.float32)
y = np.array([1., 0.], dtype=np.float32)
print "learning....."
model.fit(X, y, nb_epoch=100)
print model.predict(X)

Output

Epoch 100/100
2/2 [==============================] - 0s - loss: 9.9477e-07 - acc: 1.0000
[[ 0.99945992]
 [ 0.00129277]]
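Following up on the note above about the loss choice: a minimal sketch of the more conventional binary-classification setup, keeping the same Keras 1.x API and the same (illustrative) layer sizes as the snippets above, but swapping mean_squared_error for binary_crossentropy:

from keras.layers import Input, Dense
from keras.models import Model
import numpy as np

inputs = Input(shape=(1,))
# hidden layers copied from the snippets above purely for illustration
x = Dense(1024, activation='relu')(inputs)
x = Dense(2048, activation='relu')(x)
# single sigmoid unit for a single binary target
predictions = Dense(1, activation='sigmoid')(x)

model = Model(input=inputs, output=predictions)
# binary_crossentropy matches a sigmoid output better than mean_squared_error
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

X = np.array([[0.], [1.]], dtype=np.float32)
y = np.array([1., 0.], dtype=np.float32)
model.fit(X, y, nb_epoch=100)
print model.predict(X)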

Upvotes: 1
