Vluedo404

Reputation: 3

Can't fix ValueError: Building a simple neural network model in Keras

I am new to TensorFlow and Keras, and I wanted to build a simple neural network in Keras that can count from 0 to 7 in binary (i.e. 000-111). The network should have:

It sounds simple, but I am having problems building the model. I get the following error:

ValueError: Error when checking target: expected dense_2 to have shape (1,) but got array with shape (3,)

The code I've tried so far:

import plaidml.keras
plaidml.keras.install_backend()
import os
os.environ["KERAS_BACKEND"] = plaidml.keras.backend

import keras
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras import backend as K
import numpy as np

x_train = [[0.0, 0.0, 0.0], [0.0, 0.0, 1.0], [0.0, 1.0, 0.0], [0.0, 1.0, 1.0],
           [1.0, 0.0, 0.0], [1.0, 0.0, 1.0], [1.0, 1.0, 0.0], [1.0, 1.0, 1.0]]

y_train = [[0.0, 0.0, 1.0], [0.0, 1.0, 0.0], [0.0, 1.0, 1.0], [1.0, 0.0, 0.0],
           [1.0, 0.0, 1.0], [1.0, 1.0, 0.0], [1.0, 1.0, 1.0], [0.0, 0.0, 0.0]]

x_train = np.array(x_train)
y_train = np.array(y_train)

x_test = x_train
y_test = y_train

print(x_train)
print(y_train)
print("x_test_len", len(x_test))
print("y_test_len", len(y_test))

# Build a CNN model. You should see INFO:plaidml:Opening device xxx after you run this chunk
model = keras.Sequential()
model.add(Dense(input_dim=3, output_dim=8, activation='relu'))
model.add(Dense(input_dim=8, output_dim=3, activation='relu'))

# Compile the model
model.compile(optimizer='adam', loss=keras.losses.sparse_categorical_crossentropy, metrics=['accuracy'])

# Fit the model on training set
model.fit(x_train, y_train, epochs=10)

# Evaluate the model on test set
score = model.evaluate(x_test, y_test, verbose=0)
# Print test accuracy
print('\n', 'Test accuracy ', score[1])

I think there are probably a couple of things I didn't get right.

Upvotes: 0

Views: 84

Answers (1)

today

Reputation: 33410

There are two problems:

  1. You are using 'relu' as the activation of the last layer (i.e. the output layer), but your model should produce vectors of zeros and ones. Therefore, you need to use 'sigmoid' instead.

  2. Since 'sigmoid' is used as the activation of the last layer, you also need to use binary_crossentropy as the loss function. The sparse_categorical_crossentropy loss you are using expects integer class labels of shape (1,) per sample, which is why Keras complains when it gets your one-hot-style targets of shape (3,).
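With both fixes applied, the model might look like the sketch below. This assumes the plain Keras API (the PlaidML backend setup from the question is omitted), and the layer sizes and epoch count are illustrative, not tuned:

```python
import numpy as np
from keras.models import Sequential
from keras.layers import Dense

x_train = np.array([[0., 0., 0.], [0., 0., 1.], [0., 1., 0.], [0., 1., 1.],
                    [1., 0., 0.], [1., 0., 1.], [1., 1., 0.], [1., 1., 1.]])
# each input maps to the next number in binary (111 wraps around to 000)
y_train = np.roll(x_train, -1, axis=0)

model = Sequential([
    Dense(8, activation='relu'),
    Dense(3, activation='sigmoid'),   # sigmoid: one independent 0/1 per output bit
])
model.compile(optimizer='adam',
              loss='binary_crossentropy',   # matches the sigmoid outputs
              metrics=['accuracy'])
model.fit(x_train, y_train, epochs=500, verbose=0)
```

Note that the `output_dim`/`input_dim` keyword arguments from your code belong to the old Keras 1 API; in Keras 2 the first positional argument of `Dense` is the number of units, and the input shape is inferred on the first call to `fit`.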

To give you a better understanding of the problem, you can view it as a multi-label classification problem: each of the 3 nodes in the output layer should act as an independent binary classifier (hence using sigmoid as the activation and binary crossentropy as the loss function).
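To make that view concrete, here is a small NumPy sketch of what the output layer does under this framing (the logit values are made up for illustration): each node squashes its raw score through a sigmoid and is penalized by binary crossentropy independently of the other nodes, and predictions are recovered by thresholding each node at 0.5.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    # mean over the 3 output nodes, each treated as its own binary classifier
    y_pred = np.clip(y_pred, eps, 1 - eps)
    return -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))

# hypothetical raw scores from the last Dense layer for one sample
logits = np.array([2.0, -3.0, 1.5])
probs = sigmoid(logits)            # independent probability that each bit is 1
bits = (probs > 0.5).astype(int)   # threshold each node separately -> [1, 0, 1]

y_true = np.array([1.0, 0.0, 1.0])
loss = binary_crossentropy(y_true, probs)
```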

Upvotes: 1
