Denoising Autoencoder for Images with large shape

I want to build a denoising autoencoder that works on images of any shape. Most of the examples I have found use images no larger than (500, 500), while my images are document scans of shape (3000, 2000). I tried reshaping the images and building the model on them, but the predictions are incorrect. Could someone help me?

I tried building the model with the code at https://github.com/mrdragonbear/Autoencoders/blob/master/Autoencoder-Tutorial.ipynb, experimenting with the image shape, but the predictions fail.

Upvotes: 2

Answers (1)

Yehya Chali

Reputation: 26

I already have a document denoiser. You do not need a model that accepts the large shape: simply split each image into chunks, feed the chunks to the model, and then merge the predicted chunks back together. My model accepts images of shape 512x512, so I split the images into 512x512 chunks. When a dimension is not a multiple of 512, the last chunk is taken flush with the bottom or right edge, so it overlaps the previous one. The images must be at least 512x512; if an image is smaller, resize it or fit it into a 512x512 shape.
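For the smaller-image case, one way to "fit it in a 512x512 shape" is to pad the scan onto a blank canvas instead of resizing it. The following is only a minimal sketch under some assumptions: the image is a 2-D grayscale NumPy array, the page background is white (255), and `pad_to_chunk` is a hypothetical helper name, not part of my denoiser.

```python
import numpy as np

def pad_to_chunk(image, size=512, fill=255):
    """Pad a 2-D grayscale image to at least size x size using `fill`."""
    height, width = image.shape[:2]
    # Only pad dimensions that are too small; larger ones are left alone.
    pad_h = max(size - height, 0)
    pad_w = max(size - width, 0)
    padded = np.pad(image, ((0, pad_h), (0, pad_w)),
                    mode="constant", constant_values=fill)
    # Return the original size so the padding can be cropped off afterwards.
    return padded, (height, width)

small = np.zeros((300, 400), dtype=np.uint8)
padded, original_size = pad_to_chunk(small)
print(padded.shape)

# After denoising, crop back to the original size.
restored = padded[:original_size[0], :original_size[1]]
```

Padding keeps the text at its original scale, which may matter if the model was trained on scans of a particular resolution; resizing is simpler but changes the stroke width the model sees.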

import numpy as np

def split_page(page):
    """Split a page (height and width both >= 512) into a grid of 512x512 chunks."""
    chunk_h, chunk_w = 512, 512
    height, width = page.shape[:2]
    # Top-left corners of the chunks that fit completely along each axis.
    ys = [r * chunk_h for r in range(height // chunk_h)]
    xs = [c * chunk_w for c in range(width // chunk_w)]
    # If a dimension is not a multiple of 512, add one extra chunk aligned to
    # the bottom/right edge; it overlaps the previous chunk.
    if height % chunk_h:
        ys.append(height - chunk_h)
    if width % chunk_w:
        xs.append(width - chunk_w)
    chunks = [[page[y:y + chunk_h, x:x + chunk_w] for x in xs] for y in ys]
    return chunks, page.shape[:2]

def merge_chunks(chunks, osize):
    """Reassemble the chunk grid from split_page into a page of shape osize."""
    page = np.ones(osize)
    n_rows, n_cols = len(chunks), len(chunks[0])
    for i in range(n_rows):
        # The last row/column is aligned to the bottom/right edge,
        # so it may overlap (and overwrite part of) its neighbour.
        y = osize[0] - 512 if i == n_rows - 1 else i * 512
        for j in range(n_cols):
            x = osize[1] - 512 if j == n_cols - 1 else j * 512
            page[y:y + 512, x:x + 512] = chunks[i][j]
    return page

def denoise(chunk):
    # `model` is the trained denoiser, loaded beforehand; it takes grayscale
    # input of shape (batch, 512, 512, 1) with values scaled to [0, 1].
    chunk = chunk.reshape(1, 512, 512, 1) / 255.
    denoised = model.predict(chunk).reshape(512, 512) * 255.
    return denoised

def denoise_page(page):
    # Split the page, denoise every chunk, then stitch the results together.
    chunks, osize = split_page(page)
    chunks = np.array(chunks)
    denoised_chunks = np.ones(chunks.shape)
    for i, row in enumerate(chunks):
        for j, chunk in enumerate(row):
            denoised_chunks[i][j] = denoise(chunk)
    denoised_page = merge_chunks(denoised_chunks, osize)
    return denoised_page
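
Calling `model.predict` once per chunk works, but on a large scan it can be slow. A possible variation, shown here only as a sketch, is to stack all chunks into a single batch and predict once. The `IdentityModel` class is a hypothetical stand-in that returns its input unchanged so the example runs without a trained network, and `denoise_chunks_batched` is likewise an illustrative name; with a real Keras-style model you would pass that model instead.

```python
import numpy as np

class IdentityModel:
    """Hypothetical stand-in for a trained denoiser: returns its input."""
    def predict(self, batch):
        return batch

def denoise_chunks_batched(chunks, model):
    """Denoise a (rows, cols, 512, 512) chunk grid with one predict call."""
    chunks = np.asarray(chunks, dtype=np.float32)
    rows, cols = chunks.shape[:2]
    # Flatten the grid into a batch and add the channel axis.
    batch = chunks.reshape(rows * cols, 512, 512, 1) / 255.0
    denoised = np.asarray(model.predict(batch))
    # Restore the grid layout and the 0-255 value range.
    return denoised.reshape(rows, cols, 512, 512) * 255.0

grid = np.random.randint(0, 256, size=(2, 3, 512, 512)).astype(np.uint8)
out = denoise_chunks_batched(grid, IdentityModel())
print(out.shape)
```

The result has the same grid layout as the per-chunk loop produces, so it can be passed to `merge_chunks` in the same way.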

Upvotes: 1
