Tarun

Reputation: 182

Sequential Image classification

I have hundreds of tif files, each containing multiple images, and I want to build a binary classifier. First, I split every tif into PNG images (e.g. two tif files containing 20 and 30 images respectively become 50 PNG images of 600 x 600 in another directory). I then trained a CNN on them, but the results were not up to the mark. The images within a tif are sequential, and that ordering carries information that might be relevant for classification, so I'm now trying a CNN+LSTM. I have a CSV file containing the filenames and labels, and I'm using ImageDataGenerator's flow_from_dataframe to load the data. Here's the code:

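# Imports were not shown in the original post; tf.keras is assumed here
from tensorflow.keras import backend as k
from tensorflow.keras import metrics
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (Activation, BatchNormalization, ConvLSTM2D,
                                     Dense, Dropout, Flatten, MaxPooling2D)
from tensorflow.keras.preprocessing.image import ImageDataGenerator

img_width, img_height = 600, 600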
no_frame = 5
original_train = "PATH TO IMAGES"
nb_training_samples = 6587
nb_validation_samples = 1646
epochs = 1
batch_size = 32
lr = 0.001
    
if k.image_data_format() == "channels_first":
    input_shape = (3,img_width,img_height)
else:
    input_shape = (img_width, img_height,3)
    
METRICS = [
  metrics.TruePositives(name='tp'),
  metrics.FalsePositives(name='fp'),
  metrics.TrueNegatives(name='tn'),
  metrics.FalseNegatives(name='fn'),
  metrics.BinaryAccuracy(name='accuracy'),
  metrics.Precision(name='precision'),
  metrics.Recall(name='recall'),
  metrics.AUC(name='auc'),
]

model = Sequential()
model.add(ConvLSTM2D(filters = 32, kernel_size=(3,3),
                    activation='relu',
                    return_sequences=True,
                    padding='same',
                    input_shape=(None,img_width, img_height,3)))
model.add(BatchNormalization())
model.add(ConvLSTM2D(64,(3,3), activation='relu',padding='same'))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Flatten())
model.add(Dense(64))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(2))
model.add(Activation('sigmoid'))

model.compile(loss='binary_crossentropy', optimizer='rmsprop', metrics=METRICS)

model.summary()

Model: "sequential_5"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv_lst_m2d_10 (ConvLSTM2D) (None, None, 600, 600, 32 40448     
_________________________________________________________________
batch_normalization_9 (Batch (None, None, 600, 600, 32 128       
_________________________________________________________________
conv_lst_m2d_11 (ConvLSTM2D) (None, 600, 600, 64)      221440    
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 300, 300, 64)      0         
_________________________________________________________________
flatten_3 (Flatten)          (None, 5760000)           0         
_________________________________________________________________
dense_4 (Dense)              (None, 64)                368640064 
_________________________________________________________________
activation_2 (Activation)    (None, 64)                0         
_________________________________________________________________
dropout_2 (Dropout)          (None, 64)                0         
_________________________________________________________________
dense_5 (Dense)              (None, 2)                 130       
_________________________________________________________________
activation_3 (Activation)    (None, 2)                 0         
=================================================================
Total params: 368,902,210
Trainable params: 368,902,146
Non-trainable params: 64
_________________________________________________________________

datagen = ImageDataGenerator(rescale=1/255., validation_split=0.2)

train_generator = datagen.flow_from_dataframe(dataframe=data, directory=original_train,
                                             x_col='Id',
                                             y_col='label',
                                             target_size=(img_width,img_height),
                                             class_mode='categorical',
                                             batch_size=batch_size,
                                             subset='training',
                                             seed=7)

print(train_generator.class_indices)

validation_generator = datagen.flow_from_dataframe(dataframe=data, directory=original_train,
                                             x_col='Id',
                                             y_col='label',
                                             target_size=(img_width,img_height),
                                             class_mode='categorical',
                                             batch_size=batch_size,
                                             subset='validation',
                                             seed=7)

print(validation_generator.class_indices)

train_steps = train_generator.n//train_generator.batch_size
validation_steps = validation_generator.n//validation_generator.batch_size


history = model.fit_generator(train_generator,steps_per_epoch=train_steps, epochs=epochs,
                              validation_data=validation_generator,validation_steps=validation_steps)

After this, I get the following error:

ValueError: Error when checking input: expected conv_lst_m2d_10_input to have 5 dimensions, but got array with shape (32, 600, 600, 3)

I have a few questions about it:

  1. How can I resolve this error?
  2. How can I pass one tif as a batch, given that the number of images in a single tif varies?

Any help is appreciated.

Thanks :)

Edit 1 :

I have created a custom generator that looks like:

# Imports were not shown in the original post; tf.keras is assumed here
import cv2
import numpy as np
from tensorflow.keras.utils import Sequence

class DataGenerator(Sequence):
    
    def __init__(self, list_IDs, labels, image_path, to_fit=True, batch_size=32, dim=(5,600,600),
                n_channel=1, n_classes=2, shuffle=True):
        self.list_IDs = list_IDs
        self.labels = labels
        self.image_path = image_path
        self.to_fit = to_fit
        self.batch_size = batch_size
        self.dim = dim
        self.n_channel = n_channel
        self.n_classes = n_classes
        self.shuffle = shuffle
        self.on_epoch_end()
    
    def __len__(self):
        # Number of full batches per epoch
        return int(np.floor(len(self.list_IDs)/self.batch_size))
    
    def __getitem__(self,index):
        # Positions (into list_IDs and labels) of the samples in this batch
        indexes = self.indexes[index * self.batch_size:(index+1)*self.batch_size]
        
        X,y = self._generate_data(indexes)
        
        return X,y
    
    def on_epoch_end(self):
        # Keras calls on_epoch_end (not on_epoc_end) after every epoch
        self.indexes = np.arange(len(self.list_IDs))
        if self.shuffle == True:
            np.random.shuffle(self.indexes)
    
    def _generate_data(self, indexes):
        X = np.empty((self.batch_size, *self.dim, self.n_channel))
        y = np.empty((self.batch_size), dtype = np.uint8) 
        
        for i, idx in enumerate(indexes):
            # Note: one (600, 600, 1) image is broadcast across all 5 time steps
            X[i,] = self._load_grayscale_image(self.image_path + self.list_IDs[idx])
            # Use the sample's own position, not the position within the batch
            y[i] = self.labels[idx]
        return X, y
    
    def _load_grayscale_image(self,image_path):
        img = cv2.imread(image_path+'.png')
        img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        img = img / 255
        img = img[:,:,np.newaxis]
        return img

and I load the data with:

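# Imports were not shown in the original post
import pandas as pd
from sklearn.model_selection import train_test_split

def loadData(filepath, val_sample=0.2):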
    data = pd.read_csv(filepath)
    image_IDs = data['Id'].values
    labels = data['label'].values
    X_train, X_test, Y_train, Y_test = train_test_split(image_IDs, labels, test_size=val_sample, shuffle=False)
    train_data = DataGenerator(X_train, Y_train,image_path = original_train, batch_size = batch_size, shuffle=False)
    val_data = DataGenerator(X_test,Y_test,image_path = original_train, batch_size = batch_size, shuffle=False)
    return train_data,val_data

but after getting the shape right for the model, it now gives:

ValueError: Error when checking input: expected reshape_2_input to have 4 dimensions, but got array with shape (32, 5, 600, 600, 1)

Upvotes: 0

Views: 329

Answers (1)

Nicolas Gervais

Reputation: 36624

Like any other LSTM layer, ConvLSTM2D needs a time-steps dimension, so the full input shape should be:

(n_samples, time_steps, height, width, channels)
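As a rough illustration of that layout (a minimal NumPy sketch with small made-up sizes, not the asker's real data), a batch of single images can be given a time axis of length 1, and a list of consecutive frames can be stacked into one sequence sample:

```python
import numpy as np

# Hypothetical small sizes so the sketch runs quickly; the question uses 600 x 600 images.
batch, height, width, channels = 4, 8, 8, 3

# What ImageDataGenerator yields: (n_samples, height, width, channels)
images = np.zeros((batch, height, width, channels), dtype=np.float32)

# Insert a time axis of length 1 -> (n_samples, 1, height, width, channels)
single_step = np.expand_dims(images, axis=1)

# Stack 5 consecutive frames into one sequence -> (5, height, width, channels)
frames = [np.zeros((height, width, channels), dtype=np.float32) for _ in range(5)]
sequence = np.stack(frames, axis=0)

# Add the sample axis in front -> (1, 5, height, width, channels)
sequence_batch = sequence[np.newaxis, ...]

print(single_step.shape)     # (4, 1, 8, 8, 3)
print(sequence_batch.shape)  # (1, 5, 8, 8, 3)
```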

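Since it's difficult to add a dimension when using ImageDataGenerator, I suggest reshaping the data as it enters the network, which gives every sample a single time step (this fixes the shape error, though it does not by itself use the ordering of the frames):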

model.add(Reshape((1,) + input_shape, input_shape=input_shape))

Copy/pastable example:

from tensorflow.keras.layers import (Activation, BatchNormalization, ConvLSTM2D,
                                     Dense, Dropout, Flatten, MaxPooling2D, Reshape)
from tensorflow.keras import Sequential
import numpy as np

img_width, img_height = 32, 32
input_shape = (img_width, img_height, 3)
batch_size = 8

model = Sequential()
model.add(Reshape((1,) + input_shape, input_shape=input_shape))
model.add(ConvLSTM2D(filters=8, kernel_size=(3, 3),
                     activation='relu',
                     return_sequences=True,
                     padding='same',
                     input_shape=(None, img_width, img_height, 3)))
model.add(BatchNormalization())
model.add(ConvLSTM2D(8, (3, 3), activation='relu', padding='same'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dense(8))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(2))
model.add(Activation('sigmoid'))

model.compile(loss='binary_crossentropy', optimizer='rmsprop')

model.summary()

fake_picture = np.random.rand(*((batch_size,) + input_shape)).astype(np.float32)
model(fake_picture)
<tf.Tensor: shape=(8, 2), dtype=float32, numpy=
array([[0.49504986, 0.4995347 ],
       [0.49617144, 0.5001322 ],
       [0.4947565 , 0.50097185],
       [0.49597737, 0.4996349 ],
       [0.49563733, 0.50064707],
       [0.49486715, 0.49945754],
       [0.49625823, 0.50110054],
       [0.49568254, 0.50056493]], dtype=float32)>
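
For the second question (tif files with varying numbers of images), one possible approach, sketched here under assumptions rather than taken from the question's code, is to cut each tif's frame list into fixed-length windows of `no_frame` frames, so that every sample has the same number of time steps. The helper `make_windows`, the `frames_by_tif` mapping and the frame IDs below are hypothetical; in this version, frames left over at the end of a tif that do not fill a window are simply dropped.

```python
def make_windows(frames_by_tif, no_frame=5):
    """Split each tif's ordered frame list into non-overlapping windows.

    frames_by_tif: dict mapping a tif name to its ordered list of frame IDs.
    Returns a list of (tif_name, [frame IDs]) with exactly no_frame IDs each.
    """
    windows = []
    for tif_name, frames in frames_by_tif.items():
        # Step through the frames no_frame at a time; skip the incomplete tail.
        for start in range(0, len(frames) - no_frame + 1, no_frame):
            windows.append((tif_name, frames[start:start + no_frame]))
    return windows


# Hypothetical example: one tif with 12 frames, one with 7.
frames_by_tif = {
    "a": [f"a_{i}" for i in range(12)],
    "b": [f"b_{i}" for i in range(7)],
}
windows = make_windows(frames_by_tif, no_frame=5)
print(len(windows))  # 3
print(windows[0])    # ('a', ['a_0', 'a_1', 'a_2', 'a_3', 'a_4'])
```

Each window could then be loaded as one (no_frame, height, width, channels) array and batched into the 5D shape described above. A generator that already yields 5D batches would feed ConvLSTM2D directly, without the Reshape layer.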

Upvotes: 1
