Reputation: 1454
Hi all I am working on Image processing and have written a short piece of code in MATLAB. The code is quite slow.
I am giving my code snippet here
for i=1:10
//find c1,c2,c3
//c1 c2 and c3 change at each iteration
u = (1./((abs(P-c1))^m) + 1./((abs(P-c2))^m) + 1./((abs(P-c3))^m));
u1 = 1./((abs(P-c1))^m)./u;
u2 = 1./((abs(P-c2))^m)./u;
u3 = 1./((abs(P-c3))^m)./u;
end
Let me explain the variables here:
P,u,u1,u2 and u3 are all matrices of size 512x512
c1,c2 and c3 are constants of dimension 1x1
m is a constant with value = 2
I want to repeat this operations in a loop (say 10 times). However my code is quite slow.
The results of the profiler are given below :
The total running time of the program was 4.6 secs. However the four steps listed above itself takes abour 80% of the time.
So I wanted to make my code run faster. MY FIRST EDIT
My changed code snippet
for i=1:10
//find c1 and c2
//c1 and c2 changes at each iteration
a=((abs(P-c1))^m);
b=((abs(P-c2))^m);
c=((abs(P-c3))^m);
x=1./a; y=1./b; z=1./c;
u = (x + y + z);
u1 = x./u;
u2 = y./u;
u3 = z./u;
end
Now the program computes in 2.47 seconds computation time for the above steps are given below:
So this is way much more faster than my first method.
2nd edit for i=1:10 //find c1,c2,c3 //c1 c2 and c3 change at each iteration
a=(P-c1).*(P-c1);
b=(P-c2).*(P-c2);
c=(P-c3).*(P-c3);
x=1./a; y=1./b; z=1./c;
u = (x + y + z);
u1 = x./u;
u2 = y./u;
u3 = z./u;
end
Now the program computes in 0.808 seconds.
The four steps described above computes above very quickly.
I am sure it can be made even faster. Can you guys please help me to further optimize my code. It would be extremely helpful for matrices larger size than 512 such as 1024 , 2048 or likewise.
Thanks in advance.
Upvotes: 4
Views: 89
Reputation: 6434
The first suggestion is:
if m = 2 and it is not changing, why you don't try these alternatives:
A*A
and if m = 2 then do you really need abs ?
this part that you are doing
1./a
is faster than
a.^(-1)
so I don't see any better option in this part.
Another thing you can try is this. instead of:
x=1./a; y=1./b; z=1./c;
u = (x + y + z);
u1 = x./u;
u2 = y./u;
u3 = z./u;
You can have this:
u = (x + y + z);
u1 = 1./(a.*u);
u2 = 1./(b.*u);
u3 = 1./(c.*u);
this way I guess it is a little bit faster by removing 3 variables. but the code becomes less readable.
Upvotes: 1
Reputation: 7186
Your current code is:
a=((abs(P-c1))^m);
b=((abs(P-c2))^m);
c=((abs(P-c3))^m);
x=1./a; y=1./b; z=1./c;
u = (x + y + z);
u1 = x./u;
u2 = y./u;
u3 = z./u;
Firstly, realize that the absolute value function is multiplicative. So |AB| = |A|x|B|
. Now, abs(P-C1)^m
is equivalent to abs( (P-C1)^m )
.
Just a preliminary glance at it suggests that some of the computation in the bottleneck can be reused. Specifically, since c1
,c2
and c3
are constants, the computation can be sped up a little bit if you try to reuse them (at the expense of additional memory).
temp_P2 = P*P;
temp_PCA = P*ones(size(P));
temp_PCB = ones(size(P))*P;
a = abs(temp_P2 - c1*temp_PCA - c1*temp_PCB + c1^2 * length(P))
The computation of temp_PCA
and temp_PCB
can also be avoided since multiplication by a constant matrix always amounts to the construction of a rank 1 matrix with either constant rows or columns.
I don't claim that any of these modifications will speed up your code but they are definitely worth trying.
Upvotes: 1