Weimin Chan

Keras SegNet: using ImageDataGenerator with fit or fit_generator

The problems I'm having importing my dataset are driving me crazy.

This is part of my SegNet code; I'll focus on the questions about importing the image and mask data.

print("CNN Model created.")

###training data
data_gen_args = dict()
image_datagen = ImageDataGenerator(**data_gen_args)
mask_datagen = ImageDataGenerator(**data_gen_args)
seed1 = 1
image_datagen.fit(images, augment=True, seed=seed1)
mask_datagen.fit(masks, augment=True, seed=seed1)

train_image_generator = image_datagen.flow_from_directory(TRAIN_im,target_size=(500, 500),batch_size=BATCH_SIZE, class_mode = None)
train_mask_generator = mask_datagen.flow_from_directory(TRAIN_mask,target_size=(500, 500),batch_size=BATCH_SIZE, class_mode = None)

train_generator = zip(train_image_generator,train_mask_generator)

###validation data
valid_gen_args = dict()
val_image_datagen = ImageDataGenerator(**valid_gen_args)
val_mask_datagen = ImageDataGenerator(**valid_gen_args)
seed2 = 5
val_image_datagen.fit(val_images, augment=True, seed=seed2)
val_mask_datagen.fit(val_masks, augment=True, seed=seed2)

val_image_generator = val_image_datagen.flow_from_directory(VAL_im,target_size=(500, 500),batch_size=BATCH_SIZE, class_mode = None)
val_mask_generator = val_mask_datagen.flow_from_directory(VAL_mask,target_size=(500, 500),batch_size=BATCH_SIZE, class_mode = None)

val_generator = zip(val_image_generator,val_mask_generator)

###
model.fit_generator(
train_generator,steps_per_epoch=nb_train_samples//BATCH_SIZE,epochs=EPOCHS,validation_data=val_generator,validation_steps=nb_validation_samples//BATCH_SIZE)

My questions are:

  1. I changed the input size to 500x500, so I adjusted the pooling and upsampling layer sizes accordingly. Is this workable? Furthermore, can I make a classic net (like AlexNet, VGG, SegNet, ...) accept an arbitrary input image size by adjusting its pooling/upsampling layer sizes and filter numbers?

  2. I'd like to know the data type of the variables "images" and "masks" in:

    image_datagen.fit(images, augment=True, seed=seed1)
    mask_datagen.fit(masks, augment=True, seed=seed1)


    This part is from the official Keras tutorial. (Answer: I now know they are both numpy arrays.)

  3. Following on from the question above: how can I build these arrays?

    Should I write a function like mnist.load_data() below? I need some examples.

        (x_train_image, y_train_label), (x_test_image, y_test_label) = mnist.load_data()
    
  4. I use the function

    flow_from_directory


    Does this mean there is no need to define a function like mnist.load_data() myself, and that I can use it to get batched, shuffled data directly from my directory structure?

This is my directory structure:

Dataset
  |-- training
  |     |-- images       <- "many images"
  |     `-- mask         <- ground-truth images (masks)
  |-- validation
  |     |-- val_images   <- "many images"
  |     `-- val_mask     <- ground-truth images (masks)
  `-- testing            <- test images (no ground truth)

Thanks a lot!

Answers (1)

pietz

Let's get to it.

  1. SegNet is an FCN (fully convolutional network, i.e. it doesn't use dense layers), so it works with any input/output size you specify. Let me just recommend using multiples of 16 for these encoder-decoder architectures. Why? Because in your case the encoder goes from 500 to 250 to 125 to 62, while the decoder goes from 62 to 124 to 248 to 496: suddenly your resolutions don't match anymore (see the size-arithmetic sketch after this list). AlexNet and VGG use dense layers. This means you can change the initial input size to whatever you need, but you won't be able to use pretrained weights from a different resolution; the number of parameters simply doesn't match. Side note: VGG and AlexNet are classification architectures, whereas SegNet is a segmentation architecture.
  2. images and masks are 4-dimensional numpy arrays with a shape of (num_imgs, height, width, num_channels). Where do these variables come from? You must have read them from their respective image files in an earlier step.
  3. You want to iterate over each of the two folders, read each image, append it to a list, and once you're done convert that list to a numpy array. Make sure images and masks are sorted the same way so that they match each other (see the loader sketch after this list).
  4. flow_from_directory is a function that can be used with an IDG (ImageDataGenerator) to read images for you. Very handy. However, it only lets you skip writing your own loader if you don't need featurewise_center, featurewise_std_normalization or zca_whitening, because those require the numpy arrays to already be available so the IDG's fit() function can be executed. By the way, this fit() has nothing to do with the fit() that starts the training of your model; it just uses the same naming convention. A sketch of pairing the two directory streams follows after this list.
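
To make the resolution point in (1) concrete, here is a minimal sketch of the size arithmetic, assuming 2x2 max-pooling (which floors odd sizes) and 2x upsampling across three stages:

    def encoder_decoder_sizes(side, stages=3):
        """Feature-map side lengths down the encoder and back up the decoder."""
        down = [side]
        for _ in range(stages):
            side //= 2              # 2x2 pooling floors odd sizes
            down.append(side)
        up = [side]
        for _ in range(stages):
            side *= 2               # 2x upsampling doubles exactly
            up.append(side)
        return down, up

    print(encoder_decoder_sizes(500))  # ([500, 250, 125, 62], [62, 124, 248, 496]) -> 496 != 500
    print(encoder_decoder_sizes(512))  # ([512, 256, 128, 64], [64, 128, 256, 512]) -> matches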
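
For (3), here is a minimal loader sketch. It assumes the folder names from your directory structure, that the folders contain only image files, that all images share one size, and that matching image/mask pairs sort to the same position in both folders; the paths and target size are placeholders:

    import os
    import numpy as np
    from keras.preprocessing.image import load_img, img_to_array

    def load_folder(folder, target_size=(500, 500)):
        """Read every image in `folder`, in sorted order, into one 4-D numpy array."""
        arrays = []
        for fname in sorted(os.listdir(folder)):   # sorted, so images and masks line up
            img = load_img(os.path.join(folder, fname), target_size=target_size)
            arrays.append(img_to_array(img))
        return np.stack(arrays)                    # shape: (num_imgs, height, width, channels)

    images = load_folder('Dataset/training/images')
    masks = load_folder('Dataset/training/mask')
    assert len(images) == len(masks)               # every image needs its mask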
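
For (4), one caveat about your code: a plain zip() of the two Keras iterators is eager on Python 2 (zip returns a list there) and will hang; an explicit generator is the version-independent pattern. Also pass the same seed to both flow_from_directory calls so shuffling and augmentation stay in sync. Note that combine_generators below is my own naming, not a Keras API:

    def combine_generators(image_gen, mask_gen):
        """Yield (image_batch, mask_batch) tuples forever, for fit_generator."""
        while True:
            yield next(image_gen), next(mask_gen)

    train_image_generator = image_datagen.flow_from_directory(
        TRAIN_im, target_size=(500, 500), batch_size=BATCH_SIZE,
        class_mode=None, seed=seed1)
    train_mask_generator = mask_datagen.flow_from_directory(
        TRAIN_mask, target_size=(500, 500), batch_size=BATCH_SIZE,
        class_mode=None, seed=seed1)
    train_generator = combine_generators(train_image_generator, train_mask_generator)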
