maxwell
maxwell

Reputation: 121

How to sum part of a matrix in Matlab without creating a submatrix?

So in Matlab, say I have a matrix X with size N by N, and i is an logical index vector with size 1 by N. Then I can do

sum(X(i,i))

The problem is it is equivalent to first allocating memory for

Y=X(i,i),

then computing the sum on Y, and deleting Y. Am I right? (Hoki's answer shows that it is right.)

Is there a faster way to compute the sum without (implicitly) creating Y? In case that Y is big, lots of time can be consumed in memory operations. In other words, is it possible to do something like the following:

S=zeros(1,nnz(i));

for k=find(i)
    for j=find(i)
        S(k)=S(k)+X(j,k);
    end
end

In this way, all the memory we need aside from X is a vector S - we don't need to allocate memory for the big Y. Of course, the loop can be slow, but you get my idea.

Upvotes: 0

Views: 852

Answers (2)

Hoki
Hoki

Reputation: 11792

You assume too much on how memory management is operated.

Timing:

I ran a benchmark with timeit. From N=10 to N=20000 there is absolutely no noticeable difference in the execution time of both forms.

Moreover, I still get the very same results if I turn the JIT acceleration off ... so the optimization may be just the result of Matlab lazy-copy behaviour.

Memory usage:

On the memory side, there seem to be a difference though. The indirect method (with the temporary variable) seem to allocate the memory for this temporary variable (The size allocated correspond exactly to the size of the temporary variable). On the other hand, the direct method does not need any extra memory allocation to return a result.

This reaches the limits of my grasp on these things. I am not expert enough to pretend to explain why this difference in memory usage does not result in a timing difference. I know memory is fast but for high order of N I thought it would have made a difference. Apparently not ...

More info:

For more details on Matlab memory management, I invite you to read this article from Loren at Matlab:
Memory Management for Functions and Variables

or if you want to read a more in-depth testing of the mechanism:
Internal Matlab memory optimizations


Time benchmark:

Benchmark result:

exectime

Benchmark code:

function ExecTimes = benchmark_sumcol

%// prepare logarithmic progression (up to what my 16GB RAM can take)
nOrder = (1:9).' * 10.^(1:3) ; nOrder = [nOrder(:) ; 10000 ; 20000] ; %'
npt = numel(nOrder) ;

ExecTimes = zeros( npt , 2 ) ;

for k = 1:npt
    %// Sample data
    N = nOrder(k) ;
    X = rand(N) ;
    ci = logical(randi([0 1],1,N)) ;

    %// Benchmark
    f1 = @() direct_sum(X,ci) ;
    f2 = @() indirect_sum(X,ci) ;
    ExecTimes(k,1) = timeit( f1 ) ;
    ExecTimes(k,2) = timeit( f2 ) ;

    clear X ci
    disp(N)
end

function R = direct_sum(X,ci)
R = sum(X(:,ci)) ;

function R = indirect_sum(X,ci)
Y = X(:,ci) ;
R = sum(Y) ;

Memory benchmark:

  • Summary for both functions

both

  • Detail for indirect summation, with temporary variable. I highlighted the memory allocation:

both

  • Detail for direct summation:

both

Code for memory benchmark

%% // set profiler options
clear all
profile('-memory','on');
setpref('profiler','showJitLines',1);
profile on

%% // sample data
N = 1000 ;
X = rand(N) ;
ci = logical(randi([0 1],1,N)) ;

%% // Benchmark
R2 = bench_indirect_sum(X,ci) ;
R1 = bench_direct_sum(X,ci) ;

%% // result
profile viewer
p = profile('info');
profsave(p,'profile_results')

Last Edit:

I added your loop version to the tests, although I had to rework it a bit to make it actually work (and give the same results than the others):

function R = bench_loop_sum(X,ci)
    R = zeros(1,nnz(ci));
    idxRes=1 ;
    for k=find(ci)
         for j=1:size(X,1)
             R(idxRes)=R(idxRes)+X(j,k);
         end
        idxRes = idxRes+1 ;
    end

The result is ok in terms of memory (that is no extra memory allocation for a temporary array), but is catastrophic in terms of speed: loopjiton

And as we could expect with loops, even worse with the JIT off: loopjitoff


Now a simple change to suppress the inner loop makes things a lot better, but still a bit behind the direct way (note that this version do not allocate memory for a temporary column to do the sum):

function R = bench_loop_sum(X,ci)
    R = zeros(1,nnz(ci));
    idxRes=1 ;
    for k=find(ci)
        R(idxRes) = sum(X(:,k));
        idxRes = idxRes+1 ;
    end

loopshortjiton
with the JIT on.

Upvotes: 3

vish
vish

Reputation: 1056

There are two answers , if you will always be looking for complete columns, the answer is simple

t=sum(X);

is a row with the sum of all the columns

then

ans=sum(t(i)) 

is what you want.

if you might be looking for odd shapes linear indices might be what you are looking for.

See sub2ind

First create a linear index into the matrix (a 1D indexing) then use that index directly

usage sum over six items (5 through 10) in columns i

ind = sub2ind(size(X) , ones(6,1)* i , (5:10)'*ones(1,N))  ; 
sum(X(ind))

Upvotes: 1

Related Questions