Reputation: 53
I have an image dataset of 4644 color images, which I reshape into patches of size 50 x 50 and pass to my deep neural network.
The total number of patches generated is 369,765. I am using the tf.data input pipeline for patch generation.
My question is how to efficiently shuffle the patches before passing them to the network.
Is a buffer size of 10000 in the shuffle operation sufficient, or is there a more efficient way to shuffle among 369,765 patches?
Steps that I followed:
1. Created a single tf-record that stores all 4644 images.
2. Used the tf.data pipeline to decode each image and create patches from it.
3. Shuffled every 10000 patches and passed them to the network.
This is the code I am using, with buffer_size=10000 and num_parallel_calls=4:
dataset = (
    tf.data.TFRecordDataset( tfrecords_filename_image )
    .repeat( no_epochs )
    .map( read_and_decode, num_parallel_calls=num_parallel_calls )
    .map( get_patches_fn, num_parallel_calls=num_parallel_calls )
    .apply( tf.data.experimental.unbatch() )  # unbatch the patches we just produced
    .shuffle( buffer_size=buffer_size, seed=random_number_1 )
    .batch( batch_size )
    .prefetch( 1 )
)
get_patches function definition:
get_patches_fn = lambda image: get_patches( image, patch_size=patch_size )
def get_patches( image, patch_size=16 ):
    # Function to compute patches for a given image
    # Input:  image - image which has to be converted to patches
    #         patch_size - size of each patch
    # Output: patches of the image (4-D tensor)
    # with tf.device('/cpu:0'):
    pad = [ [ 0, 0 ], [ 0, 0 ] ]
    patches_image = tf.space_to_batch_nd( [ image ], [ patch_size, patch_size ], pad )
    patches_image = tf.split( patches_image, patch_size * patch_size, 0 )
    patches_image = tf.stack( patches_image, 3 )
    patches_image = tf.reshape( patches_image, [ -1, patch_size, patch_size, 3 ] )
    return patches_image
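For anyone who wants to sanity-check the patch logic without a TensorFlow session, here is a minimal NumPy sketch of non-overlapping patch extraction (`get_patches_np` is my own hypothetical helper; its patch ordering is not guaranteed to match the `space_to_batch_nd` version above):

```python
import numpy as np

def get_patches_np(image, patch_size=16):
    # Split an (H, W, C) image into non-overlapping (patch_size, patch_size, C)
    # patches. Assumes H and W are exact multiples of patch_size.
    h, w, c = image.shape
    patches = image.reshape(h // patch_size, patch_size,
                            w // patch_size, patch_size, c)
    # Bring the two block axes to the front, then flatten them into one
    # patch-index axis.
    patches = patches.transpose(0, 2, 1, 3, 4)
    return patches.reshape(-1, patch_size, patch_size, c)

# Tiny example: a 4x4 RGB image split into four 2x2 patches.
image = np.arange(4 * 4 * 3).reshape(4, 4, 3)
patches = get_patches_np(image, patch_size=2)
print(patches.shape)
```

Here patches are emitted in row-major block order, so `patches[0]` is the top-left 2x2 block of the image.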
read_and_decode function definition:
def read_and_decode( tf_record_file ):
    # Function to read the TensorFlow record and return an image suitable for patching
    # Input:  tf_record_file - tf record file from which the image can be extracted
    # Output: image
    features = {
        'height': tf.FixedLenFeature( [ ], tf.int64 ),
        'width': tf.FixedLenFeature( [ ], tf.int64 ),
        'image_raw': tf.FixedLenFeature( [ ], tf.string )
    }
    parsed = tf.parse_single_example( tf_record_file, features )
    image = tf.decode_raw( parsed[ 'image_raw' ], tf.uint8 )
    height = tf.cast( parsed[ 'height' ], tf.int32 )
    width = tf.cast( parsed[ 'width' ], tf.int32 )
    image_shape = tf.stack( [ height, width, -1 ] )
    image = tf.reshape( image, image_shape )
    image = image[ :, :, :3 ]
    image = tf.cast( image, tf.float32 )
    return image
Please also suggest whether it is better to create a separate tf-record for each image rather than a single tf-record for all images.
Thanks in Advance.
Upvotes: 2
Views: 544
Reputation: 5206
A single tf-record file for all images is probably good enough given the number of images you have. If you have multiple disks, you can try splitting the data into one file per disk for higher throughput, but with a dataset of your size a single file should not substantially slow down the pipeline.
Regarding the shuffle buffer size, that's an empirical question. A shuffle buffer as big as the dataset will give you true IID sampling; a smaller shuffle buffer will only approximate it. Usually more randomness is better, but only up to a point, so I recommend trying out a few different buffer sizes (assuming you can't use a buffer that fits the entire dataset) and seeing what works for you.
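To make the trade-off concrete, here is my own pure-Python sketch of the sampling scheme `Dataset.shuffle` uses (a toy model, not TensorFlow code): it keeps a buffer of `buffer_size` elements and emits a uniformly chosen one at each step, so an element can move at most `buffer_size` positions earlier in the output than where it arrived in the stream.

```python
import random

def buffered_shuffle(items, buffer_size, seed=None):
    # Toy model of tf.data.Dataset.shuffle: maintain a buffer of buffer_size
    # elements, repeatedly emit a uniformly chosen one and refill the buffer
    # from the input stream.
    rng = random.Random(seed)
    buffer, out = [], []
    for item in items:
        if len(buffer) == buffer_size:
            out.append(buffer.pop(rng.randrange(buffer_size)))
        buffer.append(item)
    while buffer:  # drain whatever is left once the stream ends
        out.append(buffer.pop(rng.randrange(len(buffer))))
    return out

# With a small buffer, early elements stay early: the result is only a
# local approximation of a full shuffle.
shuffled = buffered_shuffle(range(20), buffer_size=5, seed=0)
print(shuffled)
```

One practical consequence for your pipeline: since `get_patches` emits all patches of an image consecutively, a 10000-element buffer only mixes patches from a window of neighboring images. It may also be worth trying an additional `shuffle` on the image-level dataset (before the unbatch, with a much smaller buffer) on top of the patch-level shuffle, to decorrelate which images the buffered patches come from.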
Upvotes: 1