Reputation: 1
Using im2col with the 'sliding' option in MATLAB, I have converted the blocks of the input image into columns. I then use col2im to do the inverse process, but the output is not the same as the input image. How can I recover the input image? Can anyone please help me?
Here is the code:
in=imread('tire.tif');
[mm nn]=size(in);
m=8;n=8;
figure,imshow(in);
i1=im2col(in,[8 8],'sliding');
i2 = reshape( sum(i1),mm-m+1,nn-n+1);
out=col2im(i2,[m n],[mm nn],'sliding');
figure,imshow(out,[]);
thanks in advance...
Upvotes: 0
Views: 2858
Reputation: 1
i1, obtained with the 'sliding' option, also contains the information that you would get from the 'distinct' option; you just need to filter out the extra columns. This may not be the best way to code it up, but it works. Assume that mm is a multiple of m and nn is a multiple of n. If this is not the case, you'll have to zero-pad the image accordingly.
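in = imread('tire.tif');
[mm, nn] = size(in);   % assumed to be multiples of m and n (zero-pad otherwise)
m = 8; n = 8;
i1 = im2col(in, [m n], 'sliding');
% The columns of i1 run down the image first, (mm-m+1) of them per image column.
% Keep only the blocks that start at rows 1, m+1, ... and columns 1, n+1, ...
inSel = [];
for k = 0:nn/n-1
    inSel = [inSel, (1:m:mm) + (mm-m+1)*n*k];
end
out = col2im(i1(:,inSel), [m n], [mm nn], 'distinct');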
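If a lossless round trip is all that is needed, a shorter route is to use 'distinct' for both calls. This is only a sketch, not the approach above: the test matrix A is a hypothetical stand-in for an image whose size is already a multiple of the block size.

```matlab
% Hypothetical 16-by-24 test image; both dimensions are multiples of 8.
A = uint8(reshape(mod(0:16*24-1, 256), 16, 24));
[mm, nn] = size(A);
m = 8; n = 8;

% Each non-overlapping 8-by-8 block becomes one column of cols.
cols = im2col(A, [m n], 'distinct');

% Rearrange the columns back into an mm-by-nn image.
B = col2im(cols, [m n], [mm nn], 'distinct');

% Check that the round trip reproduces the input.
roundTripOk = isequal(A, B);
disp(roundTripOk)
```

No column selection is needed here because 'distinct' never produces the overlapping blocks in the first place.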
Upvotes: 0
Reputation: 125864
You didn't specify exactly what the problem is, but I see a few potential sources:
You shouldn't expect the output to be exactly the same as the input, since you're replacing each pixel value with the sum of the pixels in an 8-by-8 neighborhood. Also, the resulting image will shrink by 7 pixels in each dimension (i.e. by [m-1 n-1]), since the 'sliding' option of IM2COL does not pad the array with zeroes to create neighborhoods for pixels near the edges.
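To make the neighborhood sum and the shrinkage concrete, here is a small sketch on a hypothetical 12-by-12 matrix (A is not the original image). It compares the im2col result with a 'valid' convolution against a box of ones, which computes the same unpadded neighborhood sums.

```matlab
A = magic(12);                 % hypothetical stand-in for the image
[mm, nn] = size(A);
m = 8; n = 8;

% One column per sliding neighborhood: (m*n)-by-((mm-m+1)*(nn-n+1)).
cols = im2col(A, [m n], 'sliding');

% One sum per neighborhood, arranged as an image.
S = reshape(sum(cols), mm-m+1, nn-n+1);

% The same sums from a 'valid' (unpadded) 2-D convolution.
C = conv2(A, ones(m, n), 'valid');

disp(size(S))                  % 5 5, i.e. [mm-m+1 nn-n+1]
disp(isequal(S, C))            % compares the two results
```

The output is smaller than the input by m-1 rows and n-1 columns, so it cannot match the input image size without padding.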
These two lines are redundant:
i2 = reshape( sum(i1),mm-m+1,nn-n+1);
out=col2im(i2,[m n],[mm nn],'sliding');
You only need one or the other, not both:
%# Use this:
out = reshape(sum(i1),mm-m+1,nn-n+1);
%# OR this:
out = col2im(sum(i1),[m n],[mm nn],'sliding');
Image data in MATLAB is typically of type 'uint8', meaning each pixel is represented as an unsigned 8-bit integer spanning the range 0 to 255. Assuming this is what in is, when you perform your sum operation you will implicitly end up converting it to type 'double' (since an unsigned 8-bit integer will likely not be big enough to hold the sum totals). When image pixel values are represented with a double type, the pixel values are expected to span the range 0 to 1, so you will want to scale your resulting image by its maximum value to get it to display properly:
out = out./max(out(:));
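A quick way to see the type change and the effect of the scaling is the sketch below; px and sums are made-up values used purely for illustration.

```matlab
px = uint8([200 200 200]);     % hypothetical pixel values
s = sum(px);                   % integer input is summed in double by default
disp(class(s))                 % double
disp(s)                        % 600, beyond the uint8 maximum of 255

sums = [64 128; 256 512];      % hypothetical neighborhood sums
scaled = sums ./ max(sums(:)); % largest value becomes exactly 1
disp(max(scaled(:)))           % 1
```

Calling imshow(out,[]), as in the question, also stretches the display range to the data's minimum and maximum, so either approach should give a viewable image.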
Lastly, check what kind of input image you are using. For your code, you are essentially assuming in is 2-D (i.e. a grayscale intensity image). If it is a truecolor (i.e. RGB) image, the third dimension is going to cause you some trouble, and you will have to either process each color plane separately and recombine them or convert the RGB image to grayscale. If it is an indexed image (with an associated color map), you will not be able to do the sort of processing you describe above without first converting it to a grayscale representation.
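The options above could look roughly like the following sketch. The arrays rgb, X and map are hypothetical random data standing in for a truecolor image and an indexed image with its color map; rgb2gray and ind2gray are the standard conversion functions.

```matlab
m = 8; n = 8;

% Hypothetical 16-by-16 truecolor image.
rgb = uint8(randi([0 255], 16, 16, 3));
mm = size(rgb, 1);             % [mm nn] = size(rgb) would fold the third
nn = size(rgb, 2);             % dimension into nn, so query each one

% Option 1: convert to grayscale first.
gray = rgb2gray(rgb);

% Option 2: process each color plane separately, then recombine.
out = zeros(mm-m+1, nn-n+1, 3);
for c = 1:3
    cols = im2col(rgb(:,:,c), [m n], 'sliding');
    out(:,:,c) = reshape(sum(cols), mm-m+1, nn-n+1);
end

% Indexed image: convert to grayscale using its color map.
X = uint8(randi([0 3], 16, 16));          % hypothetical indices
map = [0 0 0; 1 0 0; 0 1 0; 1 1 1];       % hypothetical 4-color map
grayFromIndexed = ind2gray(X, map);
```

A simple check such as size(in,3) == 3 is usually enough to tell a truecolor image from a grayscale one before choosing a branch.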
Upvotes: 3
Reputation: 1331
Why are you expecting the output to be the same?
i2 is the result of performing a SUM over each pixel neighborhood (essentially a low-pass filter), which is the final blurry image that you see. That is, you are NOT doing an inverse process with the COL2IM call.
Upvotes: 0