A.Mani

Reputation: 71

Is there a way to vectorize this for loop

Is there a way to vectorize this for loop to speed it up?

Thank you.

    for j = 1:size(Rond_Input2Cell,1)
        for k = 1:size(Rond_Input2Cell,2)
            Rond_Input2Cell(j,k) = (Pre_Rond_Input2Cell(j,k)*Y_FGate(k)) + (net_Cell(k)*Y_InGate(k)*tmp_input(j));
        end
    end

P.S.

Matrix sizes:

Rond_Input2Cell = 39x120

Pre_Rond_Input2Cell = 39x120

Y_FGate = 1x120 (row vector)

net_Cell = 1x120 (row vector)

Y_InGate = 1x120 (row vector)

tmp_input = 1x39 (row vector)

Upvotes: 2

Views: 101

Answers (2)

Novice_Developer

Reputation: 1492

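You can speed up this calculation without a for loop by using bsxfun, which applies an element-wise operation with singleton expansion, so each vector is broadcast across the matrix without explicit copies. The code below performs the same computation on whole rows at once and adds the two terms: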

 Rond_Input2Cell = bsxfun(@times,tmp_input.' ,net_Cell.*Y_InGate) +  bsxfun(@times ,Pre_Rond_Input2Cell,Y_FGate);

Explanation:

Pre_Rond_Input2Cell(j,k)*Y_FGate(k)

This is performed by bsxfun(@times ,Pre_Rond_Input2Cell,Y_FGate), which multiplies each of the 39 rows of Pre_Rond_Input2Cell element-wise by the 120 elements of Y_FGate.

net_Cell(k)*Y_InGate(k)*tmp_input(j) is replaced by bsxfun(@times,tmp_input.' ,net_Cell.*Y_InGate), which multiplies each element of tmp_input (transposed into a column) by the element-wise product of net_Cell and Y_InGate, giving a 39x120 matrix. Finally, the sum of the two terms is stored in Rond_Input2Cell.
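To make the explanation above checkable, here is a minimal sketch that compares the original double loop with the bsxfun expression. The random data and the names loop_result, bsx_result and max_diff are assumptions made for illustration; only the sizes come from the question.

```matlab
% Random test data with the sizes given in the question (values are made up).
rng(0);
Pre_Rond_Input2Cell = rand(39, 120);
Y_FGate   = rand(1, 120);
net_Cell  = rand(1, 120);
Y_InGate  = rand(1, 120);
tmp_input = rand(1, 39);

% Reference result: the original double loop.
loop_result = zeros(39, 120);
for j = 1:39
    for k = 1:120
        loop_result(j,k) = Pre_Rond_Input2Cell(j,k)*Y_FGate(k) + net_Cell(k)*Y_InGate(k)*tmp_input(j);
    end
end

% Vectorized result: column (39x1) times row (1x120) expands to 39x120,
% and the 1x120 row Y_FGate is expanded down the 39 rows.
bsx_result = bsxfun(@times, tmp_input.', net_Cell.*Y_InGate) + bsxfun(@times, Pre_Rond_Input2Cell, Y_FGate);

% Largest absolute difference between the two results.
max_diff = max(abs(loop_result(:) - bsx_result(:)));
assert(max_diff < 1e-12);
```

If the assertion passes, the two versions agree to within floating-point tolerance on this data.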

Here is a performance check

>> perform_check
Elapsed time is 0.000475 seconds.
Elapsed time is 0.000156 seconds.
>> perform_check
Elapsed time is 0.001089 seconds.
Elapsed time is 0.000288 seconds.

Another method is to use repmat:

    tic;
    Rond_Input2Cell = (Pre_Rond_Input2Cell.*repmat(Y_FGate,size(Pre_Rond_Input2Cell,1),1)) + (repmat(tmp_input.',1,size(Pre_Rond_Input2Cell,2)).*repmat(net_Cell.*Y_InGate,size(Pre_Rond_Input2Cell,1),1));
    toc;

Here is a performance test with a for loop

>> perf_test
Elapsed time is 0.003268 seconds.
Elapsed time is 0.001719 seconds.
>> perf_test
Elapsed time is 0.004211 seconds.
Elapsed time is 0.002348 seconds.
>> perf_test
Elapsed time is 0.002384 seconds.
Elapsed time is 0.000509 seconds.
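The scripts that produced these timings are not shown, and single tic/toc measurements of sub-millisecond code tend to vary from run to run, as the numbers above do. The following is a hypothetical harness, not the original perf_test, that uses timeit, which calls each function several times and reports a typical time. The random data and the names f_bsx, f_rep, t_bsx and t_rep are assumptions.

```matlab
% Hypothetical timing harness; the data is random, only the sizes match the question.
Pre_Rond_Input2Cell = rand(39, 120);
Y_FGate   = rand(1, 120);
net_Cell  = rand(1, 120);
Y_InGate  = rand(1, 120);
tmp_input = rand(1, 39);

n = size(Pre_Rond_Input2Cell, 1);   % number of rows (39)
m = size(Pre_Rond_Input2Cell, 2);   % number of columns (120)

% Wrap each variant in a zero-argument function handle, as timeit expects.
f_bsx = @() bsxfun(@times, tmp_input.', net_Cell.*Y_InGate) + bsxfun(@times, Pre_Rond_Input2Cell, Y_FGate);
f_rep = @() Pre_Rond_Input2Cell.*repmat(Y_FGate, n, 1) + repmat(tmp_input.', 1, m).*repmat(net_Cell.*Y_InGate, n, 1);

t_bsx = timeit(f_bsx);
t_rep = timeit(f_rep);
fprintf('bsxfun: %.3g s, repmat: %.3g s\n', t_bsx, t_rep);
```

The absolute numbers depend on the machine and the MATLAB release, so treat the output as a relative comparison only.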

Here is an article by Loren on Performance of repmat vs bsxfun

Upvotes: 1

shadowfox

Reputation: 581

Your vectorized code should be something like this.

temp_mat = tmp_input' * (net_Cell .* Y_InGate);   % outer product, size 39x120

Rond_Input2Cell = (Pre_Rond_Input2Cell .* Y_FGate) + temp_mat;   % size 39x120; .* with a row vector needs implicit expansion (R2016b or later)
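The first line of this answer relies on a column vector times a row vector being an outer product, which yields the same matrix as the bsxfun(@times, ...) form in the other answer. A small sketch with made-up values, where the 1x3 and 1x2 vectors stand in for the 1x39 and 1x120 vectors of the question:

```matlab
tmp_input = [1 2 3];       % 1x3 stand-in for the 1x39 vector
v = [10 20];               % 1x2 stand-in for net_Cell .* Y_InGate

% Column (3x1) times row (1x2) gives a 3x2 matrix with
% outer(j,k) = tmp_input(j) * v(k).
outer = tmp_input.' * v;

expected = [10 20; 20 40; 30 60];
assert(isequal(outer, expected));

% Same result as the element-wise form with singleton expansion.
assert(isequal(outer, bsxfun(@times, tmp_input.', v)));
```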

Upvotes: 0
