Reputation: 21
I am trying out multiclass semantic segmentation in Keras. Right now I'm using the U-Net architecture, and have a model similar to this (but deeper):
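from keras.models import Model
from keras.layers import Input, Conv2D, BatchNormalization, Dropout, MaxPooling2D, Conv2DTranspose, concatenate
from keras.optimizers import Adam
inputs = Input(shape=(512,512,3))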
# 128
down1 = Conv2D(32, (3, 3), activation='relu', padding='same')(inputs)
down1 = BatchNormalization()(down1)
down1 = Dropout(0.1)(down1)
down1 = Conv2D(32, (3, 3), padding='same', activation='relu')(down1)
down1 = BatchNormalization()(down1)
down1_pool = MaxPooling2D((2, 2))(down1)
center = Conv2D(64, (3, 3), padding='same', activation='relu')(down1_pool)
center = BatchNormalization()(center)
center = Dropout(0.1)(center)
center = Conv2D(64, (3, 3), padding='same', activation='relu')(center)
center = BatchNormalization()(center)
# center
up1 = concatenate([Conv2DTranspose(32, (2, 2), strides=(2, 2), padding='same')(center), down1], axis=3)
up1 = Conv2D(32, (3, 3), padding='same', activation='relu')(up1)
up1 = BatchNormalization()(up1)
up1 = Dropout(0.1)(up1)
up1 = Conv2D(32, (3, 3), padding='same', activation='relu')(up1)
up1 = BatchNormalization()(up1)
# 128
classify = Conv2D(3, (1, 1), activation='softmax')(up1)
model = Model(inputs=inputs, outputs=classify)
# losses.dice_coeff is a custom metric; lr is defined elsewhere
model.compile(optimizer=Adam(lr=lr), loss='categorical_crossentropy', metrics=[losses.dice_coeff])
My dataset consists of 680k images of shape (512, 512, 3) and 680k corresponding labels. The labels are one-hot encoded and have shape (512, 512, 3), i.e. 3 classes.
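To make the label layout concrete, here is a minimal NumPy sketch of one-hot encoding a class-index mask into the (H, W, 3) shape described above. The helper name `one_hot_mask` and the tiny 2x2 mask are illustrative assumptions, not part of the actual dataset pipeline.

```python
import numpy as np

def one_hot_mask(class_ids, num_classes=3):
    """Convert an (H, W) integer mask into an (H, W, num_classes) one-hot array."""
    # Indexing the identity matrix with the mask picks one one-hot row per pixel.
    return np.eye(num_classes, dtype=np.float32)[class_ids]

# Tiny 2x2 stand-in for a 512x512 label image (hypothetical values).
mask = np.array([[0, 1],
                 [2, 1]])
labels = one_hot_mask(mask)

print(labels.shape)         # (2, 2, 3)
print(labels.sum(axis=-1))  # each pixel has exactly one active class
```

Taking `labels.argmax(axis=-1)` recovers the original integer mask, which is a quick sanity check that the encoding is consistent.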
And then my question(s): Is this the right way to set up my model? Or should I use 'sigmoid' activation and 'binary_crossentropy'?
Upvotes: 2
Views: 2008
Reputation: 2144
I have the same problem. I couldn't find a loss function that made my model converge, so I trained three separate models, one per label. With a Dice loss function I got good results for each label. I am now looking into ways to combine the predictions of all three models. In your model, softmax is the right activation, and as far as I can tell binary and categorical cross-entropy should behave similarly, since each channel of your one-hot labels is binary.
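As a rough illustration of the two ideas in this answer, the NumPy sketch below computes a soft Dice coefficient (the Dice loss would be one minus this value) and shows one possible way to merge three per-class probability maps: stack them and take the per-pixel argmax. The function names and the 2x2 probability maps are hypothetical, and argmax is only one assumed merging strategy, not necessarily the one the answerer ended up using.

```python
import numpy as np

def dice_coefficient(y_true, y_pred, eps=1e-7):
    """Soft Dice between two arrays of the same shape with values in [0, 1]."""
    intersection = np.sum(y_true * y_pred)
    return (2.0 * intersection + eps) / (np.sum(y_true) + np.sum(y_pred) + eps)

def combine_binary_predictions(prob_maps):
    """Stack per-class (H, W) probability maps and pick the best class per pixel."""
    stacked = np.stack(prob_maps, axis=-1)  # shape (H, W, num_classes)
    return stacked.argmax(axis=-1)          # shape (H, W), integer class ids

# Hypothetical outputs of three single-class models on a 2x2 image.
p0 = np.array([[0.90, 0.20], [0.10, 0.30]])
p1 = np.array([[0.05, 0.70], [0.20, 0.60]])
p2 = np.array([[0.10, 0.10], [0.80, 0.20]])

combined = combine_binary_predictions([p0, p1, p2])
print(combined)  # [[0 1]
                 #  [2 1]]

# A prediction identical to the target gives a Dice coefficient of 1.
target = (combined == 1).astype(float)
print(dice_coefficient(target, target))
```

Because the three models are trained independently, their outputs are not guaranteed to sum to 1 per pixel, so the argmax only ranks the classes and should not be read as a calibrated probability.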
Upvotes: 0
Reputation: 638
If your labels are binary, go with sigmoid activation. If they are one-hot encoded, as in your implementation, softmax should be used as the activation.
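The difference between the two activations can be seen on a single pixel. In this small NumPy sketch (the logit values are made up for illustration), softmax produces class scores that sum to 1, which matches mutually exclusive one-hot labels, while sigmoid scores each class independently.

```python
import numpy as np

def softmax(logits):
    """Softmax over the last axis, shifted by the max for numerical stability."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def sigmoid(logits):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-logits))

# Hypothetical raw network outputs for one pixel and three classes.
logits = np.array([2.0, 1.0, -1.0])

soft = softmax(logits)
sig = sigmoid(logits)

print(soft, soft.sum())  # scores compete; they sum to 1 up to rounding
print(sig, sig.sum())    # scores are independent; the sum is not constrained
```

With sigmoid, several classes can receive a high score for the same pixel, which suits multi-label masks but not a one-class-per-pixel segmentation.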
Upvotes: 1