Is there a better vectorization technique than this?

Question

I am trying to see if there are other ways of coding this code sample more efficiently. Here, y is an 1xM matrix, (say, 1x1000), and z is an NxM matrix, (say, 5x1000).

mean(ones(N,1)*y.^3 .* z,2)

This code works fine, but I worry if N increases a lot, that the ones(N,1)*y.^3 might get too wasteful and make everything slow down.

Thoughts?

user85109 · Accepted Answer

Its not THAT terrible for a matrix that small. Many times you can gain from the use of bsxfun on problems like this. Here the matrices are simply too small to really gain anything.

>> N = 5;M =1000;
>> y = rand(1,M);
>> z = rand(N,M);
>> mean(ones(N,1)*y.^3 .* z,2)
ans =
      0.12412
      0.11669
      0.12102
      0.11976
      0.12196

>> mean(bsxfun(@times,y.^3,z),2)
ans =
      0.12412
      0.11669
      0.12102
      0.11976
      0.12196

>> z*y.'.^3/M
ans =
      0.12412
      0.11669
      0.12102
      0.11976
      0.12196

As you can see, all three solutions return the same result. All are equally valid.

Now I'll compare the times required.

>> timeit(@() mean(ones(N,1)*y.^3 .* z,2))
ans =
   0.00023018

>> timeit(@() mean(bsxfun(@times,y.^3,z),2))
ans =
   0.00026829

>> timeit(@() z*y.'.^3/M)
ans =
   0.00016594

As I said, you don't gain much. In fact, bsxfun does not gain at all, and is even a bit slower. But you can gain a bit, if you re-write the expression into the third form I've posed. Not much, but a bit.

Edit: if N is large, then the timing changes a bit.

>> N = 2000;M = 1000;
>> y = rand(1,M);
>> z = rand(N,M);
>> timeit(@() mean(ones(N,1)*y.^3 .* z,2))
ans =
     0.034664

>> timeit(@() mean(bsxfun(@times,y.^3,z),2))
ans =
     0.012234

>> timeit(@() z*y.'.^3/M)
ans =
    0.0017674

The difference is the first solution explicitly creates an expanded y.^3 matrix. This is inefficient.

The bsxfun solution is better, because it never explicitly forms the expand y.^3 matrix. But it still forms a product matrix that is N by M. So this solution still must grab and fill a large chunk of memory.

You should understand why the matrix-vector multiply is the best in all cases. No large matrix is ever formed. Since * is simply a dot product (thus a sum of products) it must be more efficient. Then I divide by M after the fact to create the desired mean.

A minor improvement over the last...

>> timeit(@() z*(y.*y.*y).'/M)
ans =
    0.0015793

which gains slightly over the power op.

And timeit? This comes from the File Exchange, a terribly useful utility written by Steve Eddins to time code fragments.

Is there a better vectorization technique than this?

Answers (1)

Related Questions