Reputation: 893
I've searched several similar topics covering similar problems. For example this, this and this, among others. Despite this, I still haven't managed to solve it.
What I'm ultimately trying to do is predict three parameters using CNNs. The inputs are matrices (which can now be plotted as RGB images after pre-processing) with an initial size of (3724, 4073, 3). Because of the size of the data set, I'm feeding the CNN in batches of 16 using the following generator:
import numpy as np
import skimage.transform
from keras.utils import Sequence

class My_Generator(Sequence):
    """Generates batches of training data and ground truth. Inputs are the image paths and batch size."""

    def __init__(self, image_paths, batch_size, normalise=True):
        self.image_paths, self.batch_size = image_paths, batch_size
        self.normalise = normalise

    def __len__(self):
        # Number of batches per epoch (the last batch may be smaller)
        return int(np.ceil(len(self.image_paths) / float(self.batch_size)))

    def __getitem__(self, idx):
        batch = self.image_paths[idx * self.batch_size:(idx + 1) * self.batch_size]
        matrices, parameters = [], []
        for file_path in batch:
            # get_Matrix_and_Parameters and MMscale_param are my own helpers, defined elsewhere
            mat, param, name = get_Matrix_and_Parameters(file_path)
            # Transform the matrix from 2D to 3D as a (mat.shape[0], mat.shape[1]) RGB image. Rescale its values to [0,1]
            mat = skimage.transform.resize(mat, (mat.shape[0]//8, mat.shape[1]//8, 3),
                                           mode='constant', preserve_range=self.normalise)
            param = MMscale_param(param, name)  # Rescale the parameters
            matrices.append(mat)
            parameters.append(param)
        MAT, PAM = np.array(matrices), np.array(parameters)
        PAM = np.reshape(PAM, (PAM.shape[0], PAM.shape[1]))
        print("Shape Matrices: {0}, Shape Parameters: {1}".format(MAT.shape, PAM.shape))
        print("Individual PAM shape: {0}".format(PAM[0,:].shape))
        return MAT, PAM
The generator also downscales the matrices by a factor of 8 along each spatial axis so that they fit into memory. The function MMscale_param simply rescales the parameters to [0, 1].
The generated batches now have shape (16, 465, 509, 3). These are now fed into the following CNN architecture:
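As a quick sanity check, the batch shape quoted above follows from the integer division by 8 in the generator. The snippet below is only a sketch of that arithmetic; `downscaled_shape` is a hypothetical helper and not part of the original code.

```python
def downscaled_shape(height, width, factor=8, channels=3):
    """Output shape requested from the resize call (hypothetical helper)."""
    return (height // factor, width // factor, channels)

batch_size = 16
# Original matrices are (3724, 4073, 3); each spatial axis is floor-divided by 8.
batch_shape = (batch_size,) + downscaled_shape(3724, 4073)
print(batch_shape)  # (16, 465, 509, 3)
```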
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
conv2d_1 (Conv2D)            (None, 463, 507, 16)      448
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 231, 253, 16)      0
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 229, 251, 32)      4640
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 114, 125, 32)      0
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 112, 123, 64)      18496
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 56, 61, 64)        0
_________________________________________________________________
conv2d_4 (Conv2D)            (None, 54, 59, 128)       73856
_________________________________________________________________
max_pooling2d_4 (MaxPooling2 (None, 27, 29, 128)       0
_________________________________________________________________
conv2d_5 (Conv2D)            (None, 25, 27, 256)       295168
_________________________________________________________________
max_pooling2d_5 (MaxPooling2 (None, 12, 13, 256)       0
_________________________________________________________________
flatten_1 (Flatten)          (None, 39936)             0
_________________________________________________________________
dense_1 (Dense)              (None, 1000)              39937000
_________________________________________________________________
dense_2 (Dense)              (None, 100)               100100
_________________________________________________________________
dense_3 (Dense)              (None, 20)                2020
_________________________________________________________________
dense_4 (Dense)              (None, 3)                 63
=================================================================
Total params: 40,431,791
Trainable params: 40,431,791
Non-trainable params: 0
_________________________________________________________________
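The shapes and parameter counts in this summary can be reproduced by hand. The sketch below assumes 3x3 convolution kernels with 'valid' padding and 2x2 max pooling; these settings are not stated in the post but are consistent with the numbers in the table.

```python
def conv_out(size, kernel=3):
    # 'valid' convolution with stride 1 (assumed)
    return size - kernel + 1

def pool_out(size, pool=2):
    # non-overlapping max pooling (assumed 2x2)
    return size // pool

h, w, c = 465, 509, 3          # one downscaled input image
total_params = 0

for filters in (16, 32, 64, 128, 256):
    total_params += (3 * 3 * c + 1) * filters   # kernel weights + one bias per filter
    h, w, c = pool_out(conv_out(h)), pool_out(conv_out(w)), filters

flat = h * w * c               # size after Flatten
units_in = flat
for units in (1000, 100, 20, 3):
    total_params += (units_in + 1) * units      # weights + biases of a Dense layer
    units_in = units

print(flat)          # 39936
print(total_params)  # 40431791
```

So the network itself does end in three output units; the (1,) in the error message does not come from the architecture.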
As displayed above, the last layer in the model has output shape (None, 3). If I understand this correctly, "None" can stand for any batch size, so my target array of shape (16, 3), i.e. (batch_size, number_of_parameters_to_predict), should be valid. However, I'm still getting the following error message:
ValueError: Error when checking target: expected dense_4 to have shape (1,) but got array with shape (3,)
What I find to be very strange is the claim that Dense layer dense_4 has shape (1, ). But isn't it displayed in the architecture above that it's a (3, ) shape? This should then fit well with the input array's shape (3, ).
I've tried to reshape and/or transpose the array in several ways but without success. I've even uninstalled and reinstalled TensorFlow and Keras in the belief that something was wrong there, but still nothing.
What does seem to work, however, is to predict only one of the three parameters, giving an input shape of (1, 0). (This later yields other, memory-related errors, though.) This works regardless of how I shape the dense_4 layer: both (None, 1) and (None, 3) work, which, according to my limited knowledge, doesn't make any sense.
Here is the compilation code:
batch_size = 16
my_training_batch_generator_NIR = My_Generator(training_paths_NIR, batch_size)
my_validation_batch_generator_NIR = My_Generator(validation_paths_NIR, batch_size)
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam')
and the training code as well:
model_path = "/Models/weights.best.hdf5"
num_epochs = 10
checkpointer = ModelCheckpoint(filepath=model_path,
                               verbose=1,
                               save_best_only=True)
model.fit_generator(generator=my_training_batch_generator_NIR,
                    steps_per_epoch=(len(validation_paths_NIR) // batch_size),
                    epochs=num_epochs,
                    verbose=1,
                    callbacks=[checkpointer],
                    validation_data=my_validation_batch_generator_NIR,
                    validation_steps=(len(validation_paths_NIR) // batch_size),
                    use_multiprocessing=True,
                    max_queue_size=1,
                    workers=1)
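A side note on the step counts: the generator's `__len__` rounds up, while the `steps_per_epoch` and `validation_steps` expressions above round down, so the two can differ by one batch. A small sketch with a hypothetical file count of 100 (not a number from the post):

```python
import math

n_files = 100      # hypothetical number of image paths
batch_size = 16

steps_ceil = math.ceil(n_files / batch_size)   # what My_Generator.__len__ computes
steps_floor = n_files // batch_size            # what the fit_generator call computes
last_batch = n_files - steps_floor * batch_size

print(steps_ceil, steps_floor, last_batch)  # 7 6 4
```

With floor division the final, smaller batch is skipped each epoch. Note also that `steps_per_epoch` above is derived from `validation_paths_NIR`; if the training set has a different size, `len(my_training_batch_generator_NIR)` may be what was intended.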
So, to sum up: I'm having problems fitting a (3, ) array into what I believe is a (3, ) layer, which the error message nevertheless claims has shape (1, ). I must be missing something here.
Any help would be highly appreciated.
I'm using Keras version 2.2.2 with TensorFlow 1.9.0 backend on Ubuntu.
Upvotes: 0
Views: 616
Reputation: 2680
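This is because of the loss function you are using. 'sparse_categorical_crossentropy' expects each target to be a single integer class index, which is why Keras reports the expected target shape as (1,) even though dense_4 outputs three values. Replace it with
loss='categorical_crossentropy'
The code should work then.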
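To make the shape difference concrete, here is a minimal NumPy sketch of the two target layouts. The arrays are random placeholders, not data from the question.

```python
import numpy as np

batch_size, n_outputs = 16, 3

# Layout produced by the generator in the question:
# one row of three values per sample.
dense_targets = np.random.rand(batch_size, n_outputs)

# Layout that 'sparse_categorical_crossentropy' checks for:
# one integer class index per sample.
sparse_targets = np.random.randint(0, n_outputs, size=(batch_size, 1))

print(dense_targets.shape[1:])   # (3,)
print(sparse_targets.shape[1:])  # (1,)
```

Any loss that compares the target element by element with the (None, 3) output accepts the first layout. Since the three parameters are continuous values in [0, 1], a regression loss such as 'mean_squared_error' takes the same (batch_size, 3) targets and may be worth trying as well.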
Upvotes: 1