Arkleseisure

Reputation: 438

Why can the reshape function in keras not change the number of dimensions

I'm attempting to make a chess engine using a neural network built in Keras. I want to output a prediction of the likely policy based on training games, and I am using a 73x8x8 output to do that (each square on the board times 73 possible move types: 8 directions * 7 squares for the "queen moves", 8 knight moves, and 3 underpromotions times 3 directions, since any other promotion is a queen promotion). However, the final layer in my network is a Dense layer, which produces a flat 4672-long output. I am trying to reshape this into something easier to use through the Reshape layer.
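
The reshape arithmetic itself checks out (73 * 8 * 8 = 4672), as this small numpy sketch shows:

```python
import numpy as np

# sanity check: 73 * 8 * 8 == 4672, so a flat vector of that
# length reshapes cleanly into the (73, 8, 8) move encoding
flat = np.arange(73 * 8 * 8)
encoded = flat.reshape(73, 8, 8)
print(encoded.shape)  # (73, 8, 8)
```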

However, it gives me this error: ValueError: Error when checking target: expected reshape_1 to have 4 dimensions, but got array with shape (2, 1)

I have had a look at this question: Error when checking target: expected dense_1 to have 3 dimensions, but got array with shape (118, 1) but the answer doesn't seem to apply to Dense layers, as they do not have a `return_sequences` argument.

Here is my code:

from keras.models import Model, Input
from keras.layers import Conv2D, Dense, Flatten, Reshape
from keras.optimizers import SGD
import numpy
from copy import deepcopy


class NeuralNetwork:
    def __init__(self):
        self.network = Model()
        self.create_network()

    def create_network(self):
        input = Input((13, 8, 8))
        output = Conv2D(256, (3, 3), padding='same')(input)
        policy_head_output = Conv2D(2, (1, 1))(output)
        policy_head_output = Flatten()(policy_head_output)
        policy_head_output = Dense(4672, name='policy_output')(policy_head_output)
        policy_head_output = Reshape((73, 8, 8), input_shape=(4672,))(policy_head_output)
        value_head_output = Conv2D(1, (1, 1))(output)
        value_head_output = Dense(256)(value_head_output)
        value_head_output = Flatten()(value_head_output)
        value_head_output = Dense(1, name="value_output")(value_head_output)
        self.network = Model(outputs=[value_head_output, policy_head_output], inputs=input)

    def train_network(self, input_training_data, labels):
        sgd = SGD(0.2, 0.9)
        self.network.compile(sgd, 'categorical_crossentropy', metrics=['accuracy'])
        self.network.fit(input_training_data, [labels[0], labels[1]])
        self.network.save("Neural Net 1")


def make_training_data():
    training_data = []
    labels = []
    for i in range(6):
        training_data.append(make_image())
        labels.append(make_label_image())
    return training_data, labels


def make_image():
    data = []
    for i in range(13):
        blank_board = []
        for j in range(8):
            a = []
            for k in range(8):
                a.append(0)
            blank_board.append(a)
        data.append(blank_board)
    return data


def make_label_image():
    policy_logits = []
    blank_board = []
    for i in range(8):
        a = []
        for j in range(8):
            a.append(0)
        blank_board.append(a)

    for i in range(73):
        policy_logits.append(deepcopy(blank_board))
    return [policy_logits, [0]]


def main():
    input_training_data, output_training_data = make_training_data()
    neural_net = NeuralNetwork()
    input_training_data = numpy.array(input_training_data)
    output_training_data = numpy.array(output_training_data)
    neural_net.train_network(input_training_data, output_training_data)


main()

Could someone please explain:

  1. What's happening

  2. What I can do to fix it

Upvotes: 0

Views: 766

Answers (1)

thushv89

Reputation: 11333

There are a few things wrong with your approach.

1. Your output/target data

You're creating a list object with two elements per sample: the board and the label. The board is 73x8x8 while the label is a single 0/1 value, so the two elements have inconsistent dimensions. When you convert this ragged structure to a numpy array, this happens:

import numpy as np

a = [[0, 1, 2, 3], [0]]
arr = np.array(a, dtype=object)  # recent numpy versions require dtype=object for ragged input
print(arr)
# => [list([0, 1, 2, 3]) list([0])]

Indexing and slicing such an object array then behaves very oddly, and I won't go there. So the first thing is to separate out your data so that each list returned by make_training_data has consistent dimensions. Here the input image, output board image and output labels are returned separately:

def make_training_data():
    training_data = []
    labels = []
    board_output = []
    for i in range(6):
        training_data.append(make_image())
        board, lbl = make_label_image()
        labels.append(lbl)
        board_output.append(board)
    return training_data, board_output, labels
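
With the three lists separated, each converts to a regular numpy array with a well-defined shape. A minimal sketch using zero-filled placeholders standing in for make_image()/make_label_image():

```python
import numpy as np

# zero-filled stand-ins for the six training samples
training_data = [np.zeros((13, 8, 8)) for _ in range(6)]  # input planes
board_output = [np.zeros((73, 8, 8)) for _ in range(6)]   # policy targets
labels = [[0] for _ in range(6)]                          # value targets

print(np.array(training_data).shape)  # (6, 13, 8, 8)
print(np.array(board_output).shape)   # (6, 73, 8, 8)
print(np.array(labels).shape)         # (6, 1)
```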

and in the main(), it becomes,

input_training_data, output_training_board, output_training_labels = make_training_data()
input_training_data = numpy.array(input_training_data)
output_training_board = numpy.array(output_training_board)
output_training_labels = numpy.array(output_training_labels)

2. The error

So you're getting the error

ValueError: Error when checking target: expected reshape_1 to have 4 dimensions, but got array with shape (2, 1)

Well, it's simple: you passed the targets in the wrong order in model.fit(). In other words, your model says,

outputs=[value_head_output, policy_head_output]

and your make_label_image() returns,

[policy_logits, [0]]

which is the other way around. Your poor model is trying to reshape the value labels into that 4-dimensional structure; that's why it complains. So it should be,

neural_net.train_network(input_training_data, [output_training_labels, output_training_board])
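
Keras also accepts targets as a dict keyed by the output layers' names, which avoids ordering mistakes altogether. A Keras-free sketch of the idea, using the layer names from your model:

```python
import numpy as np

# output order as declared in Model(outputs=[value_head_output, policy_head_output])
output_names = ['value_output', 'policy_output']

# keying the targets by layer name makes the intent explicit
targets = {
    'value_output': np.zeros((6, 1)),          # value targets
    'policy_output': np.zeros((6, 73, 8, 8)),  # policy targets
}

# the ordered list Keras ultimately matches against its outputs
ordered = [targets[name] for name in output_names]
print([t.shape for t in ordered])  # [(6, 1), (6, 73, 8, 8)]
```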

Even if you correct just this (without fixing make_training_data()), you probably still won't get it working because of the inconsistencies in your numpy structure (see the first section).

3. The loss function

This is about your loss function. Your value head ends in a Dense layer with a single output, yet you're using categorical_crossentropy, which is for multi-class "categorical" outputs. You should use binary_crossentropy here, as you only have a single output unit.

Also, if you want multiple losses for your multiple outputs do the following.

self.network.compile(sgd, ['binary_crossentropy', 'mean_squared_error'], metrics=['accuracy'])

This is just an example; if you want, you can use the same loss for both outputs too.
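
Losses can likewise be given as a dict keyed by output-layer names, which keeps each loss tied to the right head. A sketch of the mapping, using the names from your model:

```python
# per-output losses keyed by the model's output layer names
losses = {
    'value_output': 'binary_crossentropy',       # single-unit value head
    'policy_output': 'categorical_crossentropy', # 73x8x8 policy head
}
# passed as: self.network.compile(sgd, losses, metrics=['accuracy'])
print(sorted(losses))  # ['policy_output', 'value_output']
```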

Upvotes: 1
