Reputation: 71
Is there a way to vectorize this for loop to speed it up?
Thank you.
for j = 1:size(Rond_Input2Cell,1)
    for k = 1:size(Rond_Input2Cell,2)
        Rond_Input2Cell(j,k) = (Pre_Rond_Input2Cell(j,k)*Y_FGate(k)) + (net_Cell(k)*Y_InGate(k)*tmp_input(j));
    end
end
P.S. Matrix sizes:
Rond_Input2Cell = 39x120
Pre_Rond_Input2Cell = 39x120
Y_FGate = 1x120 (row vector)
net_Cell = 1x120 (row vector)
Y_InGate = 1x120 (row vector)
tmp_input = 1x39 (row vector)
Upvotes: 2
Views: 101
Reputation: 1492
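You can speed up this calculation by replacing the for loop with bsxfun, which applies an element-wise operation with singleton expansion: the vectors are expanded along their singleton dimensions to match the 39x120 matrix, so no explicit loop is needed.
The code below computes both terms for the whole matrix at once and adds them: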
Rond_Input2Cell = bsxfun(@times,tmp_input.' ,net_Cell.*Y_InGate) + bsxfun(@times ,Pre_Rond_Input2Cell,Y_FGate);
Explanation:
Pre_Rond_Input2Cell(j,k)*Y_FGate(k)
is performed by bsxfun(@times ,Pre_Rond_Input2Cell,Y_FGate),
which multiplies each of the 39 rows of Pre_Rond_Input2Cell element-wise by the 1x120 row vector Y_FGate.
net_Cell(k)*Y_InGate(k)*tmp_input(j)
is replaced by bsxfun(@times,tmp_input.' ,net_Cell.*Y_InGate),
which multiplies each element of tmp_input (transposed to a 39x1 column) by the element-wise product of net_Cell and Y_InGate, giving a 39x120 matrix.
Finally, the sum of the two terms is stored in Rond_Input2Cell.
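As a sanity check on the bsxfun expression, the following minimal sketch compares it against the original double loop on random data with the sizes given in the question. The random values and the names loop_result, bsx_result and max_diff are assumptions made for illustration only; they do not appear in the original code.

```matlab
% Random test data with the sizes stated in the question (values are arbitrary).
Pre_Rond_Input2Cell = rand(39, 120);
Y_FGate   = rand(1, 120);
net_Cell  = rand(1, 120);
Y_InGate  = rand(1, 120);
tmp_input = rand(1, 39);

% Reference result: the original double loop (loop_result is a hypothetical name).
loop_result = zeros(39, 120);
for j = 1:39
    for k = 1:120
        loop_result(j,k) = (Pre_Rond_Input2Cell(j,k)*Y_FGate(k)) + ...
                           (net_Cell(k)*Y_InGate(k)*tmp_input(j));
    end
end

% Vectorized result using bsxfun (bsx_result is a hypothetical name).
bsx_result = bsxfun(@times, tmp_input.', net_Cell.*Y_InGate) + ...
             bsxfun(@times, Pre_Rond_Input2Cell, Y_FGate);

% Largest absolute difference between the two results;
% it should be no larger than floating-point rounding error.
max_diff = max(abs(loop_result(:) - bsx_result(:)));
```

If max_diff is at rounding level, the two versions can be treated as interchangeable for these inputs.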
Here is a performance check
>> perform_check
Elapsed time is 0.000475 seconds.
Elapsed time is 0.000156 seconds.
>> perform_check
Elapsed time is 0.001089 seconds.
Elapsed time is 0.000288 seconds.
Another method is to use repmat:
tic;
Rond_Input2Cell =(Pre_Rond_Input2Cell.*repmat(Y_FGate,size(Pre_Rond_Input2Cell,1),1)) + (repmat(tmp_input.',1,size(Pre_Rond_Input2Cell,2)).*repmat(net_Cell.*Y_InGate,size(Pre_Rond_Input2Cell,1),1));
toc;
Here is a performance test with a for loop:
>> perf_test
Elapsed time is 0.003268 seconds.
Elapsed time is 0.001719 seconds.
>> perf_test
Elapsed time is 0.004211 seconds.
Elapsed time is 0.002348 seconds.
>> perf_test
Elapsed time is 0.002384 seconds.
Elapsed time is 0.000509 seconds.
Here is an article by Loren on the performance of repmat vs bsxfun.
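Single tic/toc measurements at the sub-millisecond scale are noisy, as the spread in the timings above suggests. One way to get steadier numbers is to repeat each variant many times and compare the totals. The sketch below does that for the bsxfun and repmat versions; the repetition count n_reps, the random inputs, and the names t_bsxfun, t_repmat, R_bsxfun and R_repmat are assumptions for illustration, and actual timings will depend on the machine and MATLAB version.

```matlab
% Random inputs with the sizes from the question (values are arbitrary).
Pre_Rond_Input2Cell = rand(39, 120);
Y_FGate   = rand(1, 120);
net_Cell  = rand(1, 120);
Y_InGate  = rand(1, 120);
tmp_input = rand(1, 39);

n_reps = 1000;                       % assumed repetition count
n_rows = size(Pre_Rond_Input2Cell, 1);
n_cols = size(Pre_Rond_Input2Cell, 2);

% Total time for n_reps evaluations of the bsxfun version.
tic;
for r = 1:n_reps
    R_bsxfun = bsxfun(@times, tmp_input.', net_Cell.*Y_InGate) + ...
               bsxfun(@times, Pre_Rond_Input2Cell, Y_FGate);
end
t_bsxfun = toc;

% Total time for n_reps evaluations of the repmat version.
tic;
for r = 1:n_reps
    R_repmat = (Pre_Rond_Input2Cell .* repmat(Y_FGate, n_rows, 1)) + ...
               (repmat(tmp_input.', 1, n_cols) .* repmat(net_Cell.*Y_InGate, n_rows, 1));
end
t_repmat = toc;

fprintf('bsxfun: %g s, repmat: %g s (total over %d runs)\n', t_bsxfun, t_repmat, n_reps);
```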
Upvotes: 1
Reputation: 581
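Your vectorized code should be something like this:
temp_mat = tmp_input' * (net_Cell .* Y_InGate);   % outer product, size 39x120
Rond_Input2Cell = (Pre_Rond_Input2Cell .* Y_FGate) + temp_mat;   % size 39x120
Note that multiplying the 39x120 matrix by the 1x120 row vector with .* relies on automatic broadcasting (Octave, or MATLAB R2016b and later); on older MATLAB versions use bsxfun for that term.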
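To see why the matrix product of a column vector and a row vector gives the term net_Cell(k)*Y_InGate(k)*tmp_input(j) for every (j,k), here is a small worked example. The vectors tmp_small and gate_small are made-up stand-ins (not the question's data) for tmp_input and net_Cell .* Y_InGate.

```matlab
% Small made-up vectors to illustrate the outer product.
tmp_small  = [1 2 3];     % stands in for tmp_input (1x3)
gate_small = [10 20];     % stands in for net_Cell .* Y_InGate (1x2)

% Column (3x1) times row (1x2) gives a 3x2 matrix whose (j,k) entry
% is tmp_small(j) * gate_small(k).
outer_small = tmp_small' * gate_small;
% outer_small =
%     10    20
%     20    40
%     30    60

% The same matrix computed with bsxfun, for comparison.
bsx_small = bsxfun(@times, tmp_small', gate_small);
```

Both outer_small and bsx_small hold the same values, so the outer product is a drop-in replacement for the second bsxfun term.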
Upvotes: 0