Luca

Reputation: 10996

Computer vision segmentation setup: graph cut potentials

I have been teaching myself some simple computer vision algorithms, and I am trying to solve a problem where I have a noise-corrupted image and want to separate the black background from the foreground, which contains some signal. The background RGB channels are not exactly zero because they contain some noise. However, the human eye can easily tell the foreground from the background.

So I used the SLIC algorithm to break the image down into superpixels. The idea is that, since the image is noise-corrupted, computing statistics over patches rather than single pixels might classify background and foreground better because of the higher SNR.

After this, I get around 100 patches, each of which should have a similar profile, and the SLIC result seems reasonable. I have been reading about graph cuts (the Kolmogorov paper), and it seemed like something nice to try for my binary problem. So I constructed a graph that is a first-order MRF, with edges between immediate neighbours (a 4-connected graph).

Now I am wondering what unary and binary terms I could use for the segmentation. For the unary term, I was thinking of a simple Gaussian model where the background has zero mean intensity and the foreground has some non-zero mean. However, I am struggling to figure out how to encode this. Should I just assume some noise variance and compute probabilities directly from the patch statistics?

Similarly, I want to encourage neighbouring patches to take the same label, but I am not sure what binary term would reflect that. Just using the difference between the labels (1 or 0) seems weird...

Sorry for the long-winded question. I am hoping someone can give a helpful hint on how to start.

Upvotes: 0

Views: 253

Answers (1)

Miles

Reputation: 2527

You could build your CRF model over superpixels, so that two superpixels are connected whenever they are neighbours.
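As a rough sketch of how those connections could be extracted, assuming you have a 2-D integer label map from your superpixel step (the function name `superpixel_edges` and the toy label map are made up for illustration):

```python
import numpy as np

def superpixel_edges(labels):
    """Return sorted (i, j) pairs, i < j, for superpixels that share a border.

    `labels` is assumed to be a 2-D integer array holding the superpixel id
    of every pixel, e.g. the output of a SLIC implementation.
    """
    labels = np.asarray(labels)
    # Pair every pixel with its right neighbour and with its bottom neighbour.
    horiz = np.stack([labels[:, :-1].ravel(), labels[:, 1:].ravel()], axis=1)
    vert = np.stack([labels[:-1, :].ravel(), labels[1:, :].ravel()], axis=1)
    pairs = np.concatenate([horiz, vert], axis=0)
    # Keep only pixel pairs that straddle a superpixel boundary.
    pairs = pairs[pairs[:, 0] != pairs[:, 1]]
    # Order each pair so that (i, j) and (j, i) collapse into one edge.
    pairs = np.sort(pairs, axis=1)
    return sorted({(int(a), int(b)) for a, b in pairs})

# Hypothetical 3x3 label map with three superpixels.
toy_labels = np.array([[0, 0, 1],
                       [0, 2, 1],
                       [2, 2, 1]])
edges = superpixel_edges(toy_labels)
print(edges)
```

Note that the resulting graph is irregular: a superpixel can have any number of neighbours, not necessarily four.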

For your statistical model, pixel-wise posteriors are simple and cheap to compute.

So, I suggest the following for the unary terms of the CRF:

  1. Build foreground and background histograms over per-pixel texture (assuming you have a mask, or a reasonable number of marked foreground pixels; note, pixels, not superpixels).
  2. For each superpixel, assume the pixels within it are independent, so that the superpixel's likelihood of being foreground or background is the product of the likelihoods of each observation in the superpixel (in practice, we sum logs). The individual likelihood terms come from the histograms you generated.
  3. Compute the posterior for foreground as the cumulative foreground likelihood described above divided by the sum of the cumulative likelihoods of both classes (this assumes equal priors). Do the same for background.
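The three steps above could be sketched roughly as follows. This is only an illustration under some assumptions of mine: "texture" is replaced by plain grayscale intensity in [0, 1], the priors are equal, and the function names, bin count and toy image are all invented for the example.

```python
import numpy as np

def build_histogram(values, n_bins=16, eps=1e-6):
    """Step 1: normalised intensity histogram over [0, 1].

    `eps` smooths empty bins so that log-likelihoods stay finite.
    """
    counts, _ = np.histogram(values, bins=n_bins, range=(0.0, 1.0))
    probs = counts.astype(float) + eps
    return probs / probs.sum()

def superpixel_fg_posterior(image, labels, fg_hist, bg_hist):
    """Steps 2 and 3: P(foreground | pixels) per superpixel, equal priors."""
    n_bins = len(fg_hist)
    # Histogram bin index of every pixel.
    bins = np.clip((image * n_bins).astype(int), 0, n_bins - 1)
    n_sp = int(labels.max()) + 1
    flat = labels.ravel()
    # Step 2: independence assumption, so sum per-pixel log-likelihoods.
    log_fg = np.bincount(flat, weights=np.log(fg_hist)[bins].ravel(), minlength=n_sp)
    log_bg = np.bincount(flat, weights=np.log(bg_hist)[bins].ravel(), minlength=n_sp)
    # Step 3: normalise, done in log space to avoid underflow.
    m = np.maximum(log_fg, log_bg)
    log_norm = m + np.log(np.exp(log_fg - m) + np.exp(log_bg - m))
    return np.exp(log_fg - log_norm)

# Toy data: dark noisy left half (background), brighter right half (foreground).
rng = np.random.default_rng(0)
image = np.empty((8, 8))
image[:, :4] = np.clip(np.abs(rng.normal(0.0, 0.05, (8, 4))), 0.0, 1.0)
image[:, 4:] = np.clip(rng.normal(0.6, 0.1, (8, 4)), 0.0, 1.0)
labels = np.zeros((8, 8), dtype=int)   # two hypothetical superpixels
labels[:, 4:] = 1
fg_mask = labels == 1                  # stands in for the marked pixels

fg_hist = build_histogram(image[fg_mask])
bg_hist = build_histogram(image[~fg_mask])
posterior_fg = superpixel_fg_posterior(image, labels, fg_hist, bg_hist)
print(posterior_fg)
```

A common choice is then to use the negative log of these posteriors as the unary costs, so that a confident superpixel is expensive to assign to the wrong label.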

The pairwise terms between superpixels can be as simple as the difference between the mean observed (pixel-wise) textures of the two superpixels, passed through a kernel such as the radial basis function.
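A minimal sketch of that idea, again using grayscale intensity as a stand-in for texture. The function name, the bandwidth `sigma` and the toy data are assumptions for illustration; `sigma` in particular would need tuning to your noise level.

```python
import numpy as np

def rbf_pairwise_weights(image, labels, edges, sigma=0.1):
    """RBF similarity of mean intensities for each neighbouring superpixel pair.

    Returns values in (0, 1]: close to 1 for similar superpixels, close to 0
    for very different ones.
    """
    n_sp = int(labels.max()) + 1
    flat = labels.ravel()
    sums = np.bincount(flat, weights=image.ravel(), minlength=n_sp)
    counts = np.bincount(flat, minlength=n_sp)
    means = sums / np.maximum(counts, 1)
    return {
        (i, j): float(np.exp(-(means[i] - means[j]) ** 2 / (2.0 * sigma ** 2)))
        for i, j in edges
    }

# Hypothetical example: three superpixels side by side.
toy_labels = np.array([[0, 0, 1, 1, 2, 2]] * 2)
toy_image = np.array([[0.0, 0.0, 0.05, 0.05, 0.8, 0.8]] * 2)
toy_edges = [(0, 1), (1, 2)]
weights = rbf_pairwise_weights(toy_image, toy_labels, toy_edges)
print(weights)
```

One way to use these values is as the cost paid when two neighbours receive different labels (a contrast-sensitive Potts term): similar superpixels are then expensive to separate, while a cut between dissimilar ones is nearly free.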

Alternatively, you could compute a histogram over each superpixel's observed texture (again, pixel-wise) and compute the Bhattacharyya distance between each neighbouring pair of superpixels.
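For reference, a small sketch of the Bhattacharyya distance between two histograms. The function name and the example histograms are made up; `eps` is only there to keep the logarithm finite when the histograms do not overlap.

```python
import numpy as np

def bhattacharyya_distance(p, q, eps=1e-12):
    """Bhattacharyya distance -ln(sum_k sqrt(p_k * q_k)) between histograms."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    # Normalise so both inputs are treated as probability distributions.
    p = p / p.sum()
    q = q / q.sum()
    # Bhattacharyya coefficient: 1 for identical, 0 for disjoint histograms.
    bc = np.sum(np.sqrt(p * q))
    return float(-np.log(np.clip(bc, eps, 1.0)))

# Hypothetical 4-bin histograms for three superpixels.
hist_a = [10, 5, 0, 0]
hist_b = [8, 6, 1, 0]
hist_c = [0, 0, 3, 12]
d_ab = bhattacharyya_distance(hist_a, hist_b)
d_ac = bhattacharyya_distance(hist_a, hist_c)
print(d_ab, d_ac)
```

Since this is a distance, a small value means the two superpixels look alike. To turn it into a pairwise cost for disagreeing labels you would map it to a similarity first, for example with `exp(-d)`.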

Upvotes: 0
