I am trying to write my own im2col algorithm for inputs with more than 2 dimensions. Currently I am looking at the MATLAB im2col implementation. However, I cannot find any documentation on what happens for an input with more than 2 dimensions.
I do get an output if I feed a 3D tensor into the function. However, I don't really understand how you get from 2D to ND. The fact that this isn't mentioned in the documentation suggests that it's something straightforward, but still, I don't get it.
Heck, I don't even understand why the output matrix has the size it does.
Let me just start by saying that im2col is only intended for 2D matrices. The fact that it sometimes worked (and by that I mean returned a result without throwing an error) is just a happy coincidence.
Now I took a peek at edit im2col.m, and without studying the code too much, the first line of each of the distinct and sliding methods should give you an intuition of what's happening:
...
if strcmp(kind, 'distinct')
[m,n] = size(a);
...
elseif strcmp(kind,'sliding')
[ma,na] = size(a);
...
end
...
First recall that [s1,s2] = size(arr), where arr is a 3D array, will collapse the sizes of the 2nd and 3rd dimensions into one size. Here's the relevant part of doc size:
[d1,d2,d3,...,dn] = size(X) returns the sizes of the dimensions of the array X, provided the number of output arguments n equals ndims(X). If n < ndims(X), di equals the size of the ith dimension of X for 0<i<n, but dn equals the product of the sizes of the remaining dimensions of X, that is, dimensions n through ndims(X).
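A quick way to see that rule in action is to ask for only two outputs from size on a 3D array. This is a minimal sketch; the 4-by-3-by-2 array arr is just an illustrative choice, not something taken from im2col.m:

```matlab
% Hypothetical 3D array used only for illustration
arr = zeros(4, 3, 2);

% Requesting two outputs folds dimensions 2 and 3 together
[s1, s2] = size(arr);      % s1 is 4, s2 is 3*2 = 6

% Requesting all three outputs gives the true dimensions
[d1, d2, d3] = size(arr);  % 4, 3, 2
```

So any code that starts with a two-output call to size silently sees the 3D array as a 4-by-6 matrix.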
So basically for an array of size M-by-N-by-P, the function instead thinks it's a matrix of size M-by-(N*P). Now MATLAB has some quirky indexing rules that let you do things like:
>> x = reshape(1:4*3*2,4,3,2)
x(:,:,1) =
1 5 9
2 6 10
3 7 11
4 8 12
x(:,:,2) =
13 17 21
14 18 22
15 19 23
16 20 24
>> x(:,:)
ans =
1 5 9 13 17 21
2 6 10 14 18 22
3 7 11 15 19 23
4 8 12 16 20 24
which is what I think ended up happening here. Here is an example to confirm the behavior of im2col on an RGB image:
% normal case (grayscale image)
>> M = magic(5);
>> B1 = im2col(M, [3 3], 'sliding');
% (RGB image)
>> MM = cat(3, M, M+50, M+100);
>> B2 = im2col(MM, [3 3], 'sliding');
>> B3 = im2col(reshape(MM, [5 5*3]), [3 3], 'sliding');
>> assert(isequal(B2,B3))
Note that B2 and B3 are equal, so basically think of the result of im2col on an array arr = cat(3,R,G,B) as being the same as that of arr = cat(2,R,G,B) (concatenated horizontally).
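This also explains the size of the output. For the sliding method on a 2D matrix of size ma-by-na with an m-by-n block, im2col returns m*n rows and (ma-m+1)*(na-n+1) columns, one column per block position. The sketch below checks that against the collapsed size, assuming your version of im2col still accepts the 3D input as it did in the session above:

```matlab
M  = magic(5);
MM = cat(3, M, M+50, M+100);   % 5-by-5-by-3

% What im2col sees after the two-output call to size
[ma, na] = size(MM);           % ma = 5, na = 5*3 = 15
m = 3; n = 3;

% One column per sliding position on the 5-by-15 matrix
expected = [m*n, (ma-m+1)*(na-n+1)];   % [9 39]

B2 = im2col(MM, [m n], 'sliding');
assert(isequal(size(B2), expected))
```

Note that 39 is more than 3 times the 9 columns you get for a single channel: some of the sliding windows straddle the boundary between two channels, which is rarely what you want.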
Interestingly, you won't get so lucky with the 'distinct' block method:
>> B1 = im2col(M, [3 3], 'distinct') % works
% ..snip..
>> B2 = im2col(MM, [3 3], 'distinct') % errors
Subscripted assignment dimension mismatch.
Error in im2col (line 59)
aa(1:m,1:n) = a;
Now that we understand what was happening, let's think about how to do this properly for 3D arrays.
In my opinion, to implement im2col for color images, I would just run it on each color channel separately (each being a 2D matrix), and concatenate the results along the third dimension. So something like this wrapper function:
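function B = im2col_rgb(img, sz, varargin)
    % Run im2col on each channel (a 2D matrix) separately
    B = cell(1, size(img,3));
    for i = 1:size(img,3)
        B{i} = im2col(img(:,:,i), sz, varargin{:});
    end
    % Stack the per-channel results along the third dimension
    B = cat(3, B{:});
end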
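As a rough usage sketch, assuming the wrapper above is saved as im2col_rgb.m on the path and reusing the same test image as before:

```matlab
M  = magic(5);
MM = cat(3, M, M+50, M+100);   % 5-by-5-by-3 test "RGB" image

B = im2col_rgb(MM, [3 3], 'sliding');

% 9 rows (3*3 block), 9 columns ((5-3+1)^2 positions), 3 channels
assert(isequal(size(B), [9 9 3]))

% Each slice matches im2col applied to that channel alone
assert(isequal(B(:,:,1), im2col(M, [3 3], 'sliding')))
```

Here no window mixes pixels from different channels, and the 'distinct' method works too, since every call to im2col receives a plain 2D matrix.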