Index an array with a set of start and end indices

I have two arrays:

timesteps = [1,3;5,7;9,10];
data = [1,2,3,4,5,6,7,8,9,10];

The values in the timesteps array describe which values of data I want: the first column gives the index where each range starts, and the second column the index where it ends.

e.g. here I want to get [1,2,3,5,6,7,9,10].

This code works fine for me, but it's very slow because of the for loop. Is there a one-liner in MATLAB so that I can get rid of the for loop?

newData=[];
for ind=1:size(timesteps,1)
  newData=cat(2,newData,data(timesteps(ind,1):timesteps(ind,2)));
end

Edit: With Wolfie's solution I got the following (really good) result. (I only used a small dataset here; the real one is normally 50 times as big.)

(Mine)     Elapsed time is 48.579997 seconds.
(Wolfie's) Elapsed time is 0.058733 seconds.

Upvotes: 2

Answers (3)

Wolfie

Reputation: 30047

Irreducible's answer uses str2num and sprintf to flip between numeric and char data to create the indices. In my tests, this is less performant than just looping as you've already done for small arrays, but faster for large arrays, as memory allocation is handled better.

You can increase performance by preallocating your output, and indexing into it to avoid concatenation in a loop. For large arrays, this could give a large speed-up.

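% N holds the cumulative number of elements written after each range
N = [0; cumsum( diff( timesteps, [], 2 ) + 1 )];
% Preallocate the output with the total number of elements
newData = NaN( 1, max(N) );
for ind = 1:size(timesteps,1)
    % Write range ind into its own slice of the output
    newData(N(ind)+1:N(ind+1)) = data(timesteps(ind,1):timesteps(ind,2));
end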
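
To make the offsets concrete, here is a small worked sketch using the arrays from the question. The variable len is only introduced here for illustration; the rest mirrors the loop above.

```matlab
% Worked example with the arrays from the question
timesteps = [1,3; 5,7; 9,10];
data = 1:10;

len = diff(timesteps, [], 2) + 1;   % elements per range: [3; 3; 2]
N   = [0; cumsum(len)];             % write offsets:      [0; 3; 6; 8]

newData = NaN(1, N(end));           % preallocate 8 slots
for ind = 1:size(timesteps, 1)
    % range ind fills slots N(ind)+1 .. N(ind+1), i.e. 1:3, 4:6, 7:8
    newData(N(ind)+1:N(ind+1)) = data(timesteps(ind,1):timesteps(ind,2));
end
disp(newData)
```

Each range gets its own slice of the output, so nothing is resized inside the loop.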

The benchmark below shows that this is consistently quicker.

x axis: number of elements in data
y axis: time in seconds
assumption: choosing a random subset of indices, where timesteps has 4x fewer rows than data has elements.

[Figure: benchmarking plot]

Note, this is variable depending on the indices being used. In the below code, I generate the indices randomly each run, so you may see the plot jump around a bit.

However, the loop with preallocation is consistently quicker, and the run time of the loop without preallocation grows much faster than linearly as the data gets larger.

Benchmarking code

T = [];
p = 4:12;
for ii = p
    n = 2^ii;
    k = 2^(ii-2);

    timesteps = reshape( sort( randperm( n, k*2 ) ).', 2, [] ).';
    data = 1:n;

    f_Playergod = @() f1(timesteps, data);
    f_Irreducible = @() f2(timesteps, data);
    f_Wolfie = @() f3(timesteps, data);

    T = [T; [timeit( f_Playergod ), timeit( f_Irreducible ), timeit( f_Wolfie )]];
end

figure(1); clf; 
plot( T, 'LineWidth', 1.5 );
legend( {'Loop, no preallocation', 'str2num indexing', 'loop, with preallocation'}, 'location', 'best' );
xticklabels( 2.^p ); grid on;

function newData = f1( timesteps, data )
    newData=[];
    for ind=1:size(timesteps,1)
      newData=cat(2,newData,data(timesteps(ind,1):timesteps(ind,2)));
    end
end
function newData = f2( timesteps, data )
    newData = data( str2num(sprintf('%d:%d ',timesteps')) );
end
function newData = f3( timesteps, data )
    N = [0; cumsum( diff( timesteps, [], 2 ) + 1 )];
    newData = NaN( 1, max(N) );
    for ind = 1:size(timesteps,1)
        newData(N(ind)+1:N(ind+1)) = data(timesteps(ind,1):timesteps(ind,2));
    end
end

Upvotes: 5

user2305193

Reputation: 2059

I would usually go for a loop, but you could do something like this

%take each 1st-column element and 2nd-column element, and use that range of numbers to index data
a=arrayfun(@(x,y) data(x:y),timesteps(:,1),timesteps(:,2),'UniformOutput',false)
%convert cell array to vector
a=[a{:}]

I should mention that this is significantly slower than a loop.

Upvotes: 1

Irreducible

Reputation: 899

Just to get rid of the for loop you can do the following:

timesteps = [1,3;5,7;9,10];
data = [1,2,3,4,5,6,7,8,9,10];
%create an index vector of the indices you want to extract
idx=str2num(sprintf('%d:%d ',timesteps'));
%done
res=data(idx)

res =

 1     2     3     5     6     7     9    10

However, regarding run time: as stated in the comments, I have not tested it, but I doubt that it will be faster. The only advantage here is that the result array does not have to be updated with each iteration.
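
To see why the one-liner works, it may help to look at the intermediate string. A minimal sketch with the same example arrays (the variable str is only introduced here for illustration):

```matlab
timesteps = [1,3; 5,7; 9,10];
data = 1:10;

% sprintf reads the transposed matrix column by column (1,3,5,7,9,10)
% and reuses the format for each start/end pair
str = sprintf('%d:%d ', timesteps');   % '1:3 5:7 9:10 '

% str2num evaluates the text as a MATLAB expression, so the colon
% ranges are expanded and joined into one index vector
idx = str2num(str);
res = data(idx);
disp(res)
```

Since str2num evaluates the string, this approach should only be used with timesteps values you trust.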

Upvotes: 2
