Reputation: 59
I'm building a CNN for multilabel classification of chest X-rays (CXRs) with 14 classes that can coexist (https://stanfordmlgroup.github.io/competitions/chexpert/). I use Python with Keras and TensorFlow. Right now I am trying to get the code to work with a small test CNN, and I get the error "ValueError: logits and labels must have the same shape ((None, 14) vs (None, 1))". I use a sigmoid activation with binary cross-entropy loss. I think something is going wrong in the creation of the train and validation datasets. I use ImageDataGenerator.flow_from_dataframe with a pandas dataframe that has one column for each of the 14 labels (0 or 1), as shown in the image (Pandas Dataframe Structure).
I have searched for the same problem on Stack Overflow and GitHub, but most of those questions don't use ImageDataGenerator and are solved by reshaping X or Y, which I don't know how to do here. Does anyone know what is going wrong? Thanks in advance! My code is below.
import pandas as pd
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, Activation, Flatten
from tensorflow.keras.layers import Conv2D, MaxPooling2D
from tensorflow.keras.callbacks import TensorBoard
import time
import h5py
df = pd.read_csv('D:\\Milou\\CheXpert-v1.0-small\\train.csv', usecols = [0, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18])
df = df.fillna(0) # Change NaN values to 0
df = df.convert_dtypes() # Change datatypes from float to integer if possible
df = df.replace({-1 : 0}) # Regard the uncertain labels '-1' as negative
print(df.head(5))
label_names = ["No Finding", "Enlarged Cardiomediastinum", "Cardiomegaly", "Lung Opacity", "Lung Lesion", "Edema", "Consolidation", "Pneumonia", "Atelectasis", "Pneumothorax", "Pleural Effusion", "Pleural Other", "Fracture", "Support Devices"]
datagen = ImageDataGenerator(rescale=1./255, validation_split=0.2)
train_generator = datagen.flow_from_dataframe(
    dataframe=df,
    directory="D:\\Milou",
    x_col="Path",
    y_col=label_names,
    subset="training",
    class_mode="multi_output",
    target_size=(100, 100),
    batch_size=64)
validation_generator = datagen.flow_from_dataframe(
    dataframe=df,
    directory="D:\\Milou",
    x_col="Path",
    y_col=label_names,
    subset="validation",
    class_mode="multi_output",
    target_size=(100, 100),
    batch_size=64)
# Creating model
NAME = "Test-{}".format(int(time.time()))
model = Sequential()
model.add(Conv2D(32, (3, 3), input_shape=(100,100,3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(64, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten()) # this converts our 3D feature maps to 1D feature vectors
model.add(Dense(256))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(14))
model.add(Activation('sigmoid'))
tensorboard = TensorBoard(log_dir="logs_multilabel\\{}".format(NAME))
model.compile(loss='binary_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])
step_size_train = train_generator.n // train_generator.batch_size
model.fit(train_generator,
          steps_per_epoch=step_size_train,
          epochs=3,
          validation_data=validation_generator,
          callbacks=[tensorboard])
This is my first time asking a question on Stack Overflow, so don't hesitate to give feedback on missing info etc.!
Upvotes: 3
Views: 6251
Reputation: 1
You need to flatten your network's output before the final layer. Because the input to that layer is not flattened, you are getting this kind of error: in effect, you are passing a multi-dimensional array to the final layer.
tf.keras.layers.Flatten()
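For reference, here is a rough shape walk-through of the model in the question, written as a sketch in plain Python rather than Keras. It assumes the Keras defaults of padding="valid" and stride 1 for Conv2D, and a pool stride equal to the pool size for MaxPooling2D; the helper names are made up for illustration. It shows what Flatten does to the feature map and what shape leaves the final Dense layer.

```python
# Shape walk-through for the question's model (batch dimension omitted).
# Assumed Keras defaults: Conv2D padding="valid", stride 1;
# MaxPooling2D stride equal to the pool size.

def conv2d_valid(shape, filters, kernel=3):
    """Output shape of a 'valid' convolution with stride 1."""
    h, w, _ = shape
    return (h - kernel + 1, w - kernel + 1, filters)

def max_pool(shape, pool=2):
    """Output shape of max pooling with stride equal to the pool size."""
    h, w, c = shape
    return (h // pool, w // pool, c)

def flatten(shape):
    """Flatten collapses (h, w, c) into a single feature axis."""
    h, w, c = shape
    return (h * w * c,)

shape = (100, 100, 3)
shape = conv2d_valid(shape, 32)   # (98, 98, 32)
shape = max_pool(shape)           # (49, 49, 32)
shape = conv2d_valid(shape, 64)   # (47, 47, 64)
shape = max_pool(shape)           # (23, 23, 64)
flat = flatten(shape)             # (33856,)
output = (14,)                    # Dense(256) then Dense(14) act on the last axis

print(flat, output)
```

Under these assumptions the model output per batch is (None, 14), which is the first of the two shapes in the error message.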
Upvotes: 0
Reputation: 1
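Use class_mode="raw" instead of "multi_output". With "multi_output" the generator yields a list of 14 separate label arrays (one per column), while "raw" yields a single array of shape (batch_size, 14), which matches the model's 14-unit output. Make the same change in the validation generator:
train_generator = datagen.flow_from_dataframe(
    dataframe=df,
    directory="D:\\Milou",
    x_col="Path",
    y_col=label_names,
    subset="training",
    class_mode="raw",
    target_size=(100, 100),
    batch_size=64)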
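To illustrate the difference, the sketch below imitates the label batches that the two modes hand to the model, using NumPy only and made-up labels. It is not the Keras implementation, and `labels`, `raw_batch` and `multi_output_batch` are hypothetical names. As I understand it, "raw" passes the selected columns through as one array, while "multi_output" returns one array per column.

```python
import numpy as np

n_labels = 14
batch_size = 4

# Made-up 0/1 labels standing in for the 14 dataframe columns
rng = np.random.default_rng(0)
labels = rng.integers(0, 2, size=(batch_size, n_labels))

# class_mode="raw": one array holding all label columns
raw_batch = labels
print(raw_batch.shape)  # (4, 14)

# class_mode="multi_output": a list with one 1-D array per label column
multi_output_batch = [labels[:, i] for i in range(n_labels)]
print(len(multi_output_batch), multi_output_batch[0].shape)  # 14 (4,)

# Stacking the per-column arrays side by side recovers the (batch, 14) layout
restacked = np.stack(multi_output_batch, axis=1)
print(restacked.shape)  # (4, 14)
```

A single model output of shape (None, 14) can be compared directly with the first layout, but not with a list of 14 separate targets.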
Upvotes: 0
Reputation: 36624
Since you have 2 different classes, you should have 2 neurons in your final dense layer, instead of the number of observations you have. The number of observations should not be specified in your neural network architecture.
Upvotes: 2