Reputation: 886
This one's a bit more of a historical question, but I don't know where else would be better than here.
Title's fairly self-explanatory - when was im2col first used for CNNs? From my scouring of the internet, the earliest I can date im2col itself is at least 2006: on the documentation page for im2col, MathWorks claims the function was provided at some point before the R2006a release. It does not give any further specifics, and searching MathWorks' release history gives no further clues.
I'm more just curious than anything else - does anyone else have any idea?
Upvotes: 1
Views: 541
Reputation: 886
I think this is an interesting question that deserves to remain. While its fate is being debated, I'll post my findings here.
It appears the earliest known reference to 'unrolling' convolutional operations into matrix multiplies for CNNs specifically was in 'High Performance Convolutional Neural Networks for Document Processing', by several Microsoft researchers, way back in 2006. It is very clear from the figures provided that this is the im2col transform, although it was not named as such - just 'unrolling'. Some of their terminology is also a bit old-fashioned by today's standards - strides are referred to in that paper as sub-sampling - so it seems pretty churlish to deny them the accolade for not using the right name. Caffe's developers wrongly claim that using im2col is the 'Caffe trick': it is clearly not unique to Caffe, they didn't come up with it, and in fact this paper is referenced on their site.
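For anyone who hasn't seen the transform itself, here's a minimal sketch in Python/NumPy - my own illustration of the idea, not code from the paper, MATLAB, or Caffe. Each kernel-sized patch of the input is copied out into a column, so the whole convolution collapses into one matrix multiply:

```python
import numpy as np

def im2col(x, kh, kw, stride=1):
    """Unroll each sliding (kh x kw) patch of a 2-D array into a column."""
    H, W = x.shape
    out_h = (H - kh) // stride + 1
    out_w = (W - kw) // stride + 1
    cols = np.empty((kh * kw, out_h * out_w), dtype=x.dtype)
    for i in range(out_h):
        for j in range(out_w):
            patch = x[i*stride:i*stride+kh, j*stride:j*stride+kw]
            cols[:, i*out_w + j] = patch.ravel()
    return cols, out_h, out_w

# Convolution (cross-correlation, as in CNNs) as a single mat-mul:
x = np.arange(16, dtype=np.float64).reshape(4, 4)
k = np.ones((3, 3))
cols, out_h, out_w = im2col(x, 3, 3)
y = (k.ravel() @ cols).reshape(out_h, out_w)  # one matrix product does all the arithmetic
```

The copying is redundant in memory, but the payoff is that all the actual arithmetic becomes a single call into a highly tuned matrix-multiply routine.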
It's worth noting that this was by no means the first usage of mat-muls for speeding up convolutions. The paper above specifically states "Simple unfolding of convolution is a well known technique. It is commonly implemented in signal processing and communications applications." It's pretty clear that the first usage of mat-muls for convolutions is likely to be much, much older than this, although the history on that remains a bit murky.
gemm as a routine was only formally introduced with BLAS Level 3 in 1990, so it's possible that convolutions using mat-muls would have picked up considerably from then onwards due to the portability provided by BLAS. Of course, it's entirely possible - and likely - that earlier implementations would have used dedicated, non-standard matrix-multiply routines.
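To tie that back to the sketch above: with multiple filters, flattening each filter into a row of a weight matrix turns the whole convolutional layer into exactly one gemm. Here NumPy's matmul stands in for a BLAS sgemm/dgemm call (the filter choices are mine, purely for illustration):

```python
# Reusing x and im2col() from the sketch above.
filters = np.stack([np.ones((3, 3)), np.eye(3)])  # 2 illustrative 3x3 filters
W = filters.reshape(2, -1)                        # (num_filters, kh*kw)
cols, out_h, out_w = im2col(x, 3, 3)              # (kh*kw, out_h*out_w)
out = (W @ cols).reshape(2, out_h, out_w)         # one GEMM for the whole layer
```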
Upvotes: 1