Gon

Reputation: 41

Semantic Segmentation on a bigger image

[Image: output image]

PS: I saw the question "semantic segmentation for large images", where a user said it's doable. If so, is there any way to make the borders more continuous?

Upvotes: 3

Views: 1034

Answers (3)

Multihunter

Reputation: 5948

Shai's answer only works when the model has a reasonably small receptive field. The trend in modern networks is to incorporate more global information (e.g. ViTs), which makes every output pixel depend on exactly where the input was cropped. When this happens, Shai's answer is only a partial fix: you'll still get discontinuities.

Smoothly-Blend-Image-Patches (as suggested by ferlix), which I'll call SBIP here, is a nice algorithm. But their implementation only works for small images because they put everything through the model at once. It also lacks many configuration options.

The algorithm works by running the model on overlapping tiles. But instead of simply cropping the output where the tiles overlap, as in Shai's answer, SBIP smoothly blends the logits from one tile into the other. At one edge of each overlapping region the logits come entirely from one tile, at the opposite edge they come entirely from the adjacent tile, and in between the mix shifts gradually.
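To make the blending step concrete, here is a minimal NumPy sketch. It is not SBIP's code and not the code from my repo: it uses a plain linear ramp instead of SBIP's spline window, `model` is a hypothetical callable that maps a (tile, tile) crop to (tile, tile, C) logits, and the names `ramp_window`, `tile_starts` and `blend_tiles` are made up for illustration. It also assumes the image is at least one tile in each dimension.

```python
import numpy as np

def ramp_window(tile, overlap):
    """1D weights: ramp up over `overlap` pixels, flat middle, ramp down."""
    w = np.ones(tile, dtype=np.float64)
    if overlap > 0:
        ramp = (np.arange(overlap) + 0.5) / overlap  # strictly inside (0, 1)
        w[:overlap] = ramp
        w[-overlap:] = ramp[::-1]
    return w

def tile_starts(size, tile, stride):
    """Start offsets that cover [0, size); the last tile is shifted to fit."""
    if size < tile:
        raise ValueError("image must be at least one tile in each dimension")
    starts = list(range(0, size - tile + 1, stride))
    if starts[-1] != size - tile:
        starts.append(size - tile)
    return starts

def blend_tiles(image, model, tile=64, overlap=32):
    """Run `model` tile by tile and blend the logits with ramp weights."""
    h, w = image.shape[:2]
    stride = tile - overlap
    w1d = ramp_window(tile, overlap)
    win = np.outer(w1d, w1d)           # 2D weight for one tile
    acc = None                         # weighted sum of logits
    wsum = np.zeros((h, w))            # sum of weights per pixel
    for y in tile_starts(h, tile, stride):
        for x in tile_starts(w, tile, stride):
            logits = model(image[y:y + tile, x:x + tile])
            if acc is None:
                acc = np.zeros((h, w, logits.shape[-1]))
            acc[y:y + tile, x:x + tile] += logits * win[..., None]
            wsum[y:y + tile, x:x + tile] += win
    # Dividing by the weight sum keeps image borders and odd-sized
    # last tiles correctly normalised.
    return acc / wsum[..., None]
```

Inside an overlap the descending ramp of one tile and the ascending ramp of its neighbour add up to 1, so the result moves gradually from one tile's logits to the other's. Take the argmax over the last axis afterwards to get the mask.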

Here's a rough explanation of the special case of using 50% overlap (maximally smooth):

[Image: rough description of the algorithm]

I independently discovered this solution and made a GitHub repo implementing it. I also fixed the problems with SBIP and expanded the supported cases. Most notably:

  1. You can control the RAM usage:
    1. By default, it keeps the minimum number of tiles in RAM at once, e.g. approx. 1% of a 40000x40000 image.
    2. Includes option to offload to disk if even that is too much.
    3. Can choose batch size when running model.
  2. Finer control of overlap proportion. Any whole pixel value between:
    1. Overlap = 0%. Discontinuous. Fastest.
    2. Overlap = 50%. Maximally smooth. Slowest.
  3. Lets you bring your own tiling strategy (if you want), including non-square tiles.
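To give a feel for the speed side of the overlap trade-off, here is a back-of-the-envelope count of model calls. The 512-pixel tile size and the helper name `tiles_per_axis` are assumptions for illustration, not values or functions from the repo.

```python
import math

def tiles_per_axis(size, tile, overlap):
    """Number of tiles needed to cover `size` pixels along one axis."""
    stride = tile - overlap
    return max(1, math.ceil((size - overlap) / stride))

size, tile = 40000, 512                       # tile size is an assumed example
no_overlap = tiles_per_axis(size, tile, 0) ** 2
half_overlap = tiles_per_axis(size, tile, tile // 2) ** 2
print(no_overlap, half_overlap)               # 6241 24336
```

So in 2D, going from 0% to 50% overlap costs roughly 4x the model calls, and values in between scale accordingly.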

Upvotes: 0

ferlix

Reputation: 71

I think this library does what you need, using interpolation with a simple second order spline window function:

https://github.com/Vooban/Smoothly-Blend-Image-Patches

It only works if your original image is not extremely big, because of memory constraints.

Upvotes: 1

Shai

Reputation: 114926

If your model is fully convolutional, you can trivially apply it to larger images. Your only limitation is your device's memory size.

If you have no choice but to slice the image, you can still avoid discontinuities by taking your model's receptive field into account:
If you take much larger crops, sized to cover the true receptive field, and keep only the central, "valid" part of each output mask, you should be able to get a smooth and continuous mask.
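A minimal NumPy sketch of this idea, under some assumptions: `model` is a hypothetical callable that returns an output with the same height and width as its input crop, `halo` is a margin you pick to be at least the radius of the receptive field, and the image border is padded by reflection (any padding scheme could be swapped in). The name `predict_with_halo` is made up for illustration.

```python
import numpy as np

def predict_with_halo(image, model, core=64, halo=16):
    """Predict core-sized blocks, feeding the model `halo` extra pixels of
    context on every side and keeping only the central part of its output."""
    h, w = image.shape[:2]
    pad = ((halo, halo), (halo, halo)) + ((0, 0),) * (image.ndim - 2)
    padded = np.pad(image, pad, mode="reflect")
    out = None
    for y in range(0, h, core):
        for x in range(0, w, core):
            ch, cw = min(core, h - y), min(core, w - x)  # last block may be smaller
            # In padded coordinates this crop covers the block plus its halo.
            crop = padded[y:y + ch + 2 * halo, x:x + cw + 2 * halo]
            pred = model(crop)
            center = pred[halo:halo + ch, halo:halo + cw]  # the "valid" part
            if out is None:
                out = np.zeros((h, w) + center.shape[2:], dtype=center.dtype)
            out[y:y + ch, x:x + cw] = center
    return out
```

If the halo really covers the receptive field, every kept pixel sees the same context it would see in the full image, so the stitched mask matches a single full-image pass away from the image border.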

Upvotes: 1
