Reputation: 445
I am using an autoencoder to denoise high-resolution gray-scale images. Each image is divided into patches of a fixed size, i.e. 52 x 52,
the model denoises each patch, and the result is the concatenation of the denoised patches back into the original image. Below is an example of a resulting image:
You can see the individual patches after concatenation. How can I overcome this behavior?
I thought about further processing, such as blurring the patch edges to blend them together, but I don't think that is the optimal solution.
Code of the concatenation:

    num_hor_patch = 19
    num_ver_patch = 26
    start_pos = 0  # index of the next patch in the flat predictions array

    print("Building the Images Batches")
    for i in range(num_image):
        reconstruct = []
        for j in range(num_hor_patch):
            from_vertical_patches = predictions[start_pos:(start_pos + num_ver_patch)]
            horizontal_patch = np.concatenate(from_vertical_patches, axis=1)
            start_pos += num_ver_patch
            reconstruct.append(horizontal_patch)
        restored_image = np.concatenate(np.array(reconstruct), axis=0)
        output.append(restored_image)

    start_pos = 0
    test_data = np.array([np.reshape(test_data[i], (52, 52)) for i in range(test_data.shape[0])])
    for i in range(num_image):
        reconstruct = []
        for j in range(num_hor_patch):
            from_vertical_patches = test_data[start_pos:(start_pos + num_ver_patch)]
            horizontal_patch = np.concatenate(from_vertical_patches, axis=1)
            start_pos += num_ver_patch
            reconstruct.append(horizontal_patch)
        restored_image = np.concatenate(np.array(reconstruct), axis=0)
        input.append(restored_image)

    start_pos = 0
    test_noisy_data = np.array([np.reshape(test_noisy_data[i], (52, 52)) for i in range(test_noisy_data.shape[0])])
    for i in range(num_image):
        reconstruct = []
        for j in range(num_hor_patch):
            from_vertical_patches = test_noisy_data[start_pos:(start_pos + num_ver_patch)]
            horizontal_patch = np.concatenate(from_vertical_patches, axis=1)
            start_pos += num_ver_patch
            reconstruct.append(horizontal_patch)
        restored_image = np.concatenate(np.array(reconstruct), axis=0)
        noisy.append(restored_image)

    print("Exporting the Model")
    output_model['output'] = output
    output_model['original'] = input
    output_model['noisy'] = noisy
Upvotes: 4
Views: 440
Reputation: 5933
From the image it looks like you are dealing with roughly 400x800 resolution images. For that, the 16-32 GB of memory on V100 GPUs should be enough for decent batch sizes with decent model capacity! Even if not, you can cut memory utilization by a factor of 2 to 4 using mixed precision, or even pure fp16.
If your resolution is too high for these tricks, then ideally you need a model-parallelism approach: split the image across GPUs (spatial partitioning) and exchange tensors between these partitions in the computational graph during the forward and backward passes. This keeps the patches consistent, but may introduce significant performance bottlenecks!
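The tensor exchange described above can be illustrated with a toy NumPy sketch: a 3x3 box filter (standing in for one conv layer) is applied to two vertical halves of an image that each receive a one-pixel halo copied from the neighboring half, so the result at the seam matches the full-image filter exactly. The functions `box3` and `filter_with_halo` and the halo width are illustrative assumptions, not part of the linked Mesh TensorFlow code.

```python
import numpy as np

def box3(img):
    # 3x3 box filter with edge padding (stand-in for one conv layer).
    p = np.pad(img, 1, mode='edge')
    out = np.zeros(img.shape, dtype=np.float64)
    for dy in range(3):
        for dx in range(3):
            out += p[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / 9.0

def filter_with_halo(img, halo=1):
    # Filter the left and right halves separately, but give each half a
    # `halo`-pixel strip from its neighbor so the seam is consistent.
    h, w = img.shape
    mid = w // 2
    left = img[:, :mid + halo]       # left half plus halo from the right
    right = img[:, mid - halo:]      # right half plus halo from the left
    fl = box3(left)[:, :mid]         # drop the halo columns after filtering
    fr = box3(right)[:, halo:]
    return np.concatenate([fl, fr], axis=1)
```

Because each partition sees the neighbor's boundary pixels, the partitioned result equals the single-device result, which is exactly what the halo exchange in spatial partitioning buys you.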
You can take a look at this UNet architecture, which you would have to convert from Conv3D to Conv2D and switch from a segmentation to a denoising task:
https://github.com/tensorflow/mesh/blob/master/mesh_tensorflow/experimental/unet.py
Upvotes: 1
Reputation: 3279
So correct me if I am wrong, but you have the following problem:
given a noisy high-resolution image, how do you use the autoencoder to denoise it?
Your current solution is to denoise each non-overlapping tile independently.
This gives very inconsistent results at the seams between the tiles.
I think you need to use overlapping tiles rather than non-overlapping ones.
This means that each pixel is part of several tiles, not just one, and will therefore have several denoised values, one from each tile it belongs to.
The final value for each pixel is then the mean of the values from all the tiles that the pixel is part of.
The new solution:
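A minimal NumPy sketch of this overlap-and-average reconstruction. Here `denoise_fn` stands in for the autoencoder's per-tile predict call, and `tile`/`stride` are assumed parameters; full coverage requires that `stride` divides `H - tile` and `W - tile`.

```python
import numpy as np

def denoise_with_overlap(image, tile, stride, denoise_fn):
    # Run denoise_fn on overlapping tiles, accumulate the per-pixel
    # predictions, and return their mean.
    h, w = image.shape
    acc = np.zeros((h, w))  # sum of tile predictions per pixel
    cnt = np.zeros((h, w))  # number of tiles covering each pixel
    for y in range(0, h - tile + 1, stride):
        for x in range(0, w - tile + 1, stride):
            acc[y:y + tile, x:x + tile] += denoise_fn(image[y:y + tile, x:x + tile])
            cnt[y:y + tile, x:x + tile] += 1
    return acc / np.maximum(cnt, 1)  # guard against uncovered pixels
```

In practice, weighting each tile's contribution (e.g. more weight at the tile center, less at its border) blends the seams even more smoothly than a plain mean.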
Upvotes: 0