Reputation: 1691
I tested convn on two GPUs, a Quadro 6000 and a Titan; on both it takes longer than on the CPU.
A quick test can be done in MATLAB:
maxloop = 1000;
% CPU loop
for i = 1:maxloop
    output2 = convn(rand(320,1), rand([6,1,300]), 'full');
end
% GPU loop
for i = 1:maxloop
    goutput2 = convn(gpuArray.rand(320,1), gpuArray.rand([6,1,300]), 'full');
end
It takes 0.52 s on the CPU, but about 7 s on the Quadro 6000 and roughly 15 s on the Titan.
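One caveat about the timing itself (not something from my original test): gpuArray operations run asynchronously, so tic/toc needs an explicit synchronization, otherwise the measurement can depend on when MATLAB happens to block. A minimal timing sketch using wait(gpuDevice) and gputimeit, both from the Parallel Computing Toolbox; hoisting the random inputs out of the loop is just for clarity here:

g = gpuDevice;
A = gpuArray.rand(320,1);
B = gpuArray.rand([6,1,300]);
tic;
for i = 1:1000
    goutput2 = convn(A, B, 'full');
end
wait(g);                                  % block until all queued GPU work finishes
toc
t = gputimeit(@() convn(A, B, 'full'));   % per-call time, warm-up and sync handled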
What I have tested:
1) Replacing the rand inputs with fixed, predefined values gives no improvement.
2) Preallocating the GPU output (goutput2) doesn't help much either.
(Screenshots of the timing results on the Quadro 6000 and the Titan were attached here.)
I also ran the same test as in the first answer and reproduced its result with m=1000; n=100; k=5:
Elapsed time is 2.367453 seconds. %%%% GPU
Elapsed time is 27.502952 seconds. %%%% CPU
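For reference, I haven't copied that answer's exact code here; it roughly compares a looped convn of a largish 3-D array on CPU and GPU, along these lines, where the array shapes and kernel size below are my own guess, not the answer's code:

m = 1000; n = 100; k = 5;
A = rand(m, n, k);                        % data block (shape assumed)
B = rand(k, k, k);                        % small kernel (shape assumed)
tic;
for i = 1:100
    c = convn(A, B, 'full');              % CPU
end
toc
gA = gpuArray(A); gB = gpuArray(B);
tic;
for i = 1:100
    gc = convn(gA, gB, 'full');           % GPU
end
wait(gpuDevice); toc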
My question is: why does my own test code run slower on the GPU?
Upvotes: 2
Views: 920
Reputation: 1691
At first glance this looks like the question Matlab Convolution using gpu. But even after increasing the data size from [6,1,300] to [1000,1,1000], there is no improvement in the GPU loop, so it is not simply a matter of data size.
When I reshape the data from [1000,1,1000] to [1000,1000], the GPU runs faster and the CPU runs slower than in the previous test. The code is listed below:
clear all;
maxloop = 1000;
r5 = rand(5,1);
r1000 = rand([1000,1,1000]);                     % 3-D array
for i = 1:maxloop
    cpu_output1 = convn(r5, r1000, 'full');      % 3-D CPU array
end
r1000 = reshape(r1000, [1000,1000]);
for i = 1:maxloop
    cpu_output2 = convn(r5, r1000, 'full');      % 2-D CPU array
end
gr5 = gpuArray.rand(5,1);
gr1000 = gpuArray.rand([1000,1,1000]);
for i = 1:maxloop
    gpu_output1 = convn(gr5, gr1000, 'full');    % 3-D GPU array
end
gr3_1000 = reshape(gr1000, [1000,1000]);
for i = 1:maxloop
    gpu_output2 = convn(gr5, gr3_1000, 'full');  % 2-D GPU array
end
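Since the reshape is what makes the GPU loop fast, one possible workaround (my own sketch, not code from the linked question; the helper name is made up) is to collapse the singleton second dimension before calling convn and restore it afterwards. This is only valid when the kernel is a column vector, because convolving along a dimension where the kernel has size 1 changes nothing:

function y = convn_collapse(kernel, x)
% Convolve a column-vector kernel with a [p,1,q] array by going through 2-D.
% Hypothetical helper: assumes size(kernel,2) == 1 and ndims(kernel) <= 2.
    [p, ~, q] = size(x);
    y2 = convn(kernel, reshape(x, [p, q]), 'full');   % fast 2-D path
    y  = reshape(y2, [size(y2,1), 1, q]);             % back to [., 1, q]
end

With the arrays above, convn_collapse(gr5, gr1000) should give the same values as convn(gr5, gr1000, 'full') on the original 3-D gr1000, while hitting the faster 2-D path.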
Upvotes: 1