casparjespersen
casparjespersen

Reputation: 3840

Optimization/vectorization of Matlab algorithm including very big matrices

I have a optimization problem in Matlab. Assume I get the following three vectors as input:

Now, the elements of F might not (99% of the times they will not) necessarily match the items of fx exactly, which is why I have to match to the closest frequency.

Here's the catch: We are talking about big data. N can easily be up to 2 million, and this has to be run hundred times on several hundred subjects. My two concerns:

  1. Time (main concern)
  2. Memory (production will be run on machines with +16GB memory, but development is on a machine with only 8GB of memory)

I have these two working solutions. For the following, N=2604000 and M=201:

Method 1 (for-loop)

Simple for-loop. Memory is no problem at all, but it is time consuming. Easiest implementation.

tic;
I = zeros(M,N);
for i = 1:N
   [~,f] = min(abs(fx-F(i)));
   I(f,i) = A(i).^2;
end
toc;

Duration: 18.082 seconds.

Method 2 (vectorized)

The idea is to match the frequency axis with each instantaneous frequency, to get the id.

             F
        [ 0.9 0.2 2.3 1.4 ] N
   [ 0 ][  0   1   0   0  ]
fx [ 1 ][  1   0   0   1  ]
   [ 2 ][  0   0   1   0  ]
     M

And then multiply each column with the amplitude at that time.

tic;
m_Ff = repmat(F,M,1);
m_fF = repmat(fx,1,N);
[~,idx] = min(abs(m_Ff - m_fF));  clearvars m_Ff m_fF;
m_if = repmat(idx,M,1);           clearvars idx;
m_fi = repmat((1:M)',1,N);
I = double(m_if==m_fi);           clearvars m_if m_fi;
I = bsxfun(@times,I,A);
toc;

Duration: 64.223 seconds. This is surprising to me, but probably because the huge variable sizes and my limited memory forces it to store the variables as files. I have SSD, though.

The only thing I have not taken advantage of, is that the matrices will have many zero-elements. I will try and look into sparse matrices.

I need at least single precision for both the amplitudes and frequencies, but really I found that it takes a lot of time to convert from double to single.

Any suggestions on how to improve?

UPDATE

As of the suggestions, I am now down to a time of combined 2.53 seconds. This takes advantage of the fact that fx is monotonically increasing and even-spaced (always starting in 0). Here is the code:

tic; df = mode(diff(fx)); toc;       % Find fx step size
tic; idx = round(F./df+1); doc;      % Convert to bin ids
tic; I = zeros(M,N); toc;            % Pre-allocate output
tic; lin_idx = idx + (0:N-1)*M; toc; % Find indices to insert at
tic; I(lin_idx) = A.^2; toc;         % Insert

The timing outputs are the following:

Elapsed time is 0.000935 seconds.
Elapsed time is 0.021878 seconds.
Elapsed time is 0.175729 seconds.
Elapsed time is 0.018815 seconds.
Elapsed time is 2.294869 seconds.

Hence the most time-consuming step is now the very final one. Any advice on this is greatly appreciated. Thanks to @Peter and @Divakar for getting me this far.

UPDATE 2 (Solution)

Wuhuu. Using sparse(i,j,k) really improves the outcome;

tic; df = fx(2)-fx(1); toc;
tic; idx = round(F./df+1); toc;
tic; I = sparse(idx,1:N,A.^2); toc;

With timings:

Elapsed time is 0.000006 seconds.
Elapsed time is 0.016213 seconds.
Elapsed time is 0.114768 seconds.

Upvotes: 1

Views: 56

Answers (2)

Peter
Peter

Reputation: 14937

I'm not following entirely the relationship of F and fx, but it sounds like fx might be a set of bins of frequency, and you want to find the appropriate bin for each input F.

Optimizing this depends on the characteristics of fx.

If fx is monotonic and evenly spaced, then you don't need to search it at all. You just need to scale and offset F to align the scales, then round to get the bin number.

If fx is monotonic (sorted) but not evenly spaced, you want histc. This will use an efficient search on the edges of fx to find the correct bin. You probably need to transform f first so that it contains the edges of the bins rather than the centers.

If it's neither, then you should be able to at least sort it to get it monotonic, storing the sort order, and restoring the original order once you've found the correct "bin".

Upvotes: 2

Divakar
Divakar

Reputation: 221574

Here's one approach based on bsxfun -

abs_diff = abs(bsxfun(@minus,fx,F));
[~,idx] = min(abs_diff,[],1);

IOut = zeros(M,N);
lin_idx = idx + [0:N-1]*M;
IOut(lin_idx) = A.^2;

Upvotes: 2

Related Questions