Neviem

Reputation: 107

How to split image into patches/sub-images in keras/tensorflow?

I am trying to recreate the logic from this paper. The logic is summarised in a diagram (image not available).

Highlighting my problem:

Working code:

import tensorflow as tf
from tensorflow.keras.applications.densenet import DenseNet201
from tensorflow.keras.layers import Dense, Flatten, Concatenate
from tensorflow.keras.activations import relu

#main images
in1 = tf.keras.Input(shape=(256,256,3))

#4 sub patches of main image
patch1 = tf.keras.Input(shape=(128,128,3))
patch2 = tf.keras.Input(shape=(128,128,3))
patch3 = tf.keras.Input(shape=(128,128,3))
patch4 = tf.keras.Input(shape=(128,128,3))

# CNN 
cnn = DenseNet201(include_top=False, pooling='avg')

#output of full 256x256
out1 = cnn(in1)

#output of 4 128x128 patches
path_out1 = cnn(patch1)
path_out2 = cnn(patch2)
path_out3 = cnn(patch3)
path_out4 = cnn(patch4)

#average patches
patch_out_average = tf.keras.layers.Average()([path_out1, path_out2, path_out3, path_out4])

#combine features
out_combined = tf.stack([out1, patch_out_average])

My question: is there a way to make this more elegant and less manual? I don't want to declare 16 separate inputs by hand for the sixteen 64x64 patches. Is there a way to split the image into patches and return an averaged tensor, or at least to make this shorter?

Thanks.

UPDATE (using code from answer below):

import tensorflow as tf
from tensorflow.keras.applications.densenet import DenseNet201
from tensorflow.keras.layers import Dense, Flatten, Concatenate
from tensorflow.keras.activations import relu

class CreatePatches(tf.keras.layers.Layer):

    def __init__(self , patch_size, cnn):
        super(CreatePatches , self).__init__()
        self.patch_size = patch_size
        self.cnn = cnn

    def call(self, inputs):
        patches = []
        #For square images only (as inputs.shape[1] = inputs.shape[2])
        input_image_size = inputs.shape[1]
        for i in range(0 ,input_image_size , self.patch_size):
            for j in range(0 ,input_image_size , self.patch_size):
                patches.append(self.cnn(inputs[ : , i : i + self.patch_size , j : j + self.patch_size , : ]))
        return patches

#main image
in1 = tf.keras.Input(shape=(256,256,3))

# CNN 
cnn = DenseNet201(include_top=False, pooling='avg')

#output of full 256x256
out256 = cnn(in1)

#output of 4 128x128 patches
out128 = CreatePatches(patch_size=128, cnn = cnn)(in1)

#output of 16 64x64 patches
out64 = CreatePatches(patch_size=64, cnn = cnn)(in1)

#average patches
out128 = tf.keras.layers.Average()(out128)
out64 = tf.keras.layers.Average()(out64)

#combine features
out_combined = tf.stack([out256, out128, out64], axis = 1)

#average
out_averaged = tf.keras.layers.GlobalAveragePooling1D()(out_combined)

out_averaged

Upvotes: 1

Views: 4656

Answers (1)

Shubham Panchal

Reputation: 4289

Update ( 16th July 2021 )

I found this code in the Keras Vision Transformer tutorial, where a custom Keras layer creates patches from images using the tf.image.extract_patches function.

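import tensorflow as tf
from tensorflow.keras import layers

class Patches(layers.Layer):
    def __init__(self, patch_size):
        super(Patches, self).__init__()
        self.patch_size = patch_size

    def call(self, images):
        batch_size = tf.shape(images)[0]
        # Extract non-overlapping patch_size x patch_size windows
        # (the stride equals the patch size).
        patches = tf.image.extract_patches(
            images=images,
            sizes=[1, self.patch_size, self.patch_size, 1],
            strides=[1, self.patch_size, self.patch_size, 1],
            rates=[1, 1, 1, 1],
            padding="VALID",
        )
        # Each patch is flattened to patch_size * patch_size * channels values.
        patch_dims = patches.shape[-1]
        # Result shape: (batch, num_patches, patch_dims)
        patches = tf.reshape(patches, [batch_size, -1, patch_dims])
        return patches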
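
To connect this back to the question, here is a minimal sketch (not taken from the tutorial) of how tf.image.extract_patches could be used to run one shared CNN over all patches and average the resulting features. The helper name averaged_patch_features and the small tiny_cnn model are assumptions made for illustration; tiny_cnn only stands in for DenseNet201(include_top=False, pooling='avg') so the example runs without downloading weights.

```python
import tensorflow as tf

def averaged_patch_features(images, cnn, patch_size):
    """Split square images into non-overlapping patches, run `cnn` on
    every patch, and average the per-patch feature vectors."""
    channels = images.shape[-1]
    patches = tf.image.extract_patches(
        images=images,
        sizes=[1, patch_size, patch_size, 1],
        strides=[1, patch_size, patch_size, 1],
        rates=[1, 1, 1, 1],
        padding="VALID",
    )  # shape: (batch, rows, cols, patch_size * patch_size * channels)
    num_patches = patches.shape[1] * patches.shape[2]
    # Fold the patch axis into the batch axis so the CNN sees ordinary images.
    patch_images = tf.reshape(patches, [-1, patch_size, patch_size, channels])
    features = cnn(patch_images)  # shape: (batch * num_patches, feature_dim)
    features = tf.reshape(features, [-1, num_patches, features.shape[-1]])
    return tf.reduce_mean(features, axis=1)  # shape: (batch, feature_dim)

# Hypothetical stand-in for DenseNet201(include_top=False, pooling='avg').
tiny_cnn = tf.keras.Sequential([
    tf.keras.layers.Conv2D(8, 3, padding="same", activation="relu"),
    tf.keras.layers.GlobalAveragePooling2D(),
])

images = tf.random.uniform((2, 256, 256, 3))
out128 = averaged_patch_features(images, tiny_cnn, patch_size=128)  # 4 patches
out64 = averaged_patch_features(images, tiny_cnn, patch_size=64)    # 16 patches
print(out128.shape, out64.shape)
```

This sketch operates on eager tensors. To use the same idea inside a functional Keras model, the body of the helper would probably need to live in the call method of a custom layer, as the Patches class does, since some Keras versions do not accept raw TensorFlow ops on symbolic inputs.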

Existing solution

You can create a custom Keras layer that splits a given square image (width = height) into patches, like this:

import numpy as np
import tensorflow as tf

class CreatePatches( tf.keras.layers.Layer ):

    def __init__( self , patch_size ):
        super( CreatePatches , self ).__init__()
        self.patch_size = patch_size

    def call( self , inputs ):
        patches = []
        # For square images only ( as inputs.shape[ 1 ] = inputs.shape[ 2 ] )
        input_image_size = inputs.shape[ 1 ]
        # Walk over the image in steps of patch_size, row by row
        for i in range( 0 , input_image_size , self.patch_size ):
            for j in range( 0 , input_image_size , self.patch_size ):
                patches.append( inputs[ : , i : i + self.patch_size , j : j + self.patch_size , : ] )
        # A Python list with one tensor per patch
        return patches

# One random 256x256 RGB image, split into four 128x128 patches
sample_image = np.random.rand( 1 , 256 , 256 , 3 )
layer = CreatePatches( 128 )
layer( sample_image )

Just make sure that inputs.shape[ 1 ] is evenly divisible by patch_size; otherwise the patches along the right and bottom edges will be smaller than the rest.
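
As an illustration of why divisibility matters, here is a small NumPy sketch of the same slicing logic with an explicit check added. The function name split_into_patches and the error messages are assumptions made for this example, not part of the layer above.

```python
import numpy as np

def split_into_patches(images, patch_size):
    """Split a batch of square images (batch, H, W, C) into a list of patches."""
    _, height, width, _ = images.shape
    if height != width:
        raise ValueError(f"expected square images, got {height}x{width}")
    if height % patch_size != 0:
        raise ValueError(
            f"image size {height} is not evenly divisible by patch size {patch_size}"
        )
    return [
        images[:, i:i + patch_size, j:j + patch_size, :]
        for i in range(0, height, patch_size)
        for j in range(0, width, patch_size)
    ]

images = np.random.rand(1, 256, 256, 3)
patches = split_into_patches(images, 128)
print(len(patches), patches[0].shape)  # 4 (1, 128, 128, 3)

# Without the check, slicing past the edge silently returns a smaller patch:
print(images[:, 200:300, 200:300, :].shape)  # (1, 56, 56, 3)
```

A shared CNN with global pooling would still accept the smaller edge patches, but they would cover less of the image than the others, which skews the average.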

You can also include this layer in a Model:

inputs = tf.keras.layers.Input( shape=( 256 , 256 , 3 ) ) 
patches = CreatePatches( patch_size=128 )( inputs )
model = tf.keras.models.Model( inputs , patches )
model.summary()

The above snippet prints:

Model: "model_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_3 (InputLayer)         [(None, 256, 256, 3)]     0         
_________________________________________________________________
create_patches_5 (CreatePatc [(None, 128, 128, 3), (No 0         
=================================================================
Total params: 0
Trainable params: 0
Non-trainable params: 0
_________________________________________________________________

To inspect the model's outputs in more detail:

>>> model.outputs

[<KerasTensor: shape=(None, 128, 128, 3) dtype=float32 (created by layer 'create_patches_5')>,
 <KerasTensor: shape=(None, 128, 128, 3) dtype=float32 (created by layer 'create_patches_5')>,
 <KerasTensor: shape=(None, 128, 128, 3) dtype=float32 (created by layer 'create_patches_5')>,
 <KerasTensor: shape=(None, 128, 128, 3) dtype=float32 (created by layer 'create_patches_5')>]
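
Because the layer returns a Python list with one tensor per patch, it can be convenient to stack the list into a single tensor before further processing. The following is a minimal sketch on eager tensors, assuming the same 256x256 input and 128x128 patches as above; the variable names are illustrative.

```python
import tensorflow as tf

images = tf.random.uniform((1, 256, 256, 3))
patch_size = 128

# Same slicing order as the CreatePatches layer: rows first, then columns.
patch_list = [
    images[:, i:i + patch_size, j:j + patch_size, :]
    for i in range(0, 256, patch_size)
    for j in range(0, 256, patch_size)
]

# Stack along a new axis 1 to get (batch, num_patches, height, width, channels).
stacked = tf.stack(patch_list, axis=1)
print(stacked.shape)  # (1, 4, 128, 128, 3)
```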

Upvotes: 1
