Reputation: 119
I have two arrays:
timesteps = [1,3;5,7;9,10];
data = [1,2,3,4,5,6,7,8,9,10];
The values in the timesteps array describe which values of data I want: the first column is where each range starts, and the second is where it ends.
E.g. here I want to get [1,2,3,5,6,7,9,10].
This code works fine for me, but it's very slow because of the for loop. Is there a one-liner in Matlab so that I can get rid of the for loop?
newData=[];
for ind=1:size(timesteps,1)
    newData=cat(2,newData,data(timesteps(ind,1):timesteps(ind,2)));
end
Edit: With the solution from Wolfie I got the following (really good) result. (I only used a small dataset here; the real one is normally 50 times as big.)
(Mine) Elapsed time is 48.579997 seconds.
(Wolfie's) Elapsed time is 0.058733 seconds.
Upvotes: 2
Views: 443
Reputation: 30047
Irreducible's answer uses str2num and sprintf to flip between numeric and char data to create the indices. In my tests, this is less performant than just looping as you've already done for small arrays, but faster for large arrays, as memory allocation is handled better.
You can increase performance by preallocating your output, and indexing into it to avoid concatenation in a loop. For large arrays, this could give a large speed-up.
N = [0; cumsum( diff( timesteps, [], 2 ) + 1 )];
newData = NaN( 1, max(N) );
for ind = 1:size(timesteps,1)
    newData(N(ind)+1:N(ind+1)) = data(timesteps(ind,1):timesteps(ind,2));
end
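To make the offset bookkeeping concrete, here is a small worked sketch using the arrays from the question. The variable name len is introduced here purely for illustration, and N(end) is used in place of max(N), which is the same value as long as every row has start <= end.

```matlab
% Worked example with the arrays from the question
timesteps = [1,3; 5,7; 9,10];
data = 1:10;

len = diff( timesteps, [], 2 ) + 1;   % elements per row: [3; 3; 2] (len is an illustrative name)
N = [0; cumsum( len )];               % output offsets:   [0; 3; 6; 8]

newData = NaN( 1, N(end) );           % preallocate a 1x8 row vector
for ind = 1:size(timesteps,1)
    % row ind fills output positions N(ind)+1 .. N(ind+1)
    newData(N(ind)+1:N(ind+1)) = data(timesteps(ind,1):timesteps(ind,2));
end
% newData is now [1 2 3 5 6 7 9 10]
```

Each row of timesteps writes into its own pre-sized slot, so the output array is never regrown inside the loop.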
The below benchmark shows how this is consistently quicker. The index array (timesteps) has 4x fewer rows than data has elements.
[Figure: benchmarking plot]
Note, this is variable depending on the indices being used. In the below code, I generate the indices randomly each run, so you may see the plot jump around a bit.
However, the loop with preallocation is consistently quicker, and the loop without preallocation consistently blows up as the arrays grow.
Benchmarking code
T = [];
p = 4:12;
for ii = p
n = 2^ii;
k = 2^(ii-2);
timesteps = reshape( sort( randperm( n, k*2 ) ).', 2, [] ).';
data = 1:n;
f_Playergod = @() f1(timesteps, data);
f_Irreducible = @() f2(timesteps, data);
f_Wolfie = @() f3(timesteps, data);
T = [T; [timeit( f_Playergod ), timeit( f_Irreducible ), timeit( f_Wolfie )]];
end
figure(1); clf;
plot( T, 'LineWidth', 1.5 );
legend( {'Loop, no preallocation', 'str2num indexing', 'loop, with preallocation'}, 'location', 'best' );
xticklabels( 2.^p ); grid on;
function newData = f1( timesteps, data )
newData=[];
for ind=1:size(timesteps,1)
newData=cat(2,newData,data(timesteps(ind,1):timesteps(ind,2)));
end
end
function newData = f2( timesteps, data )
newData = data( str2num(sprintf('%d:%d ',timesteps')) );
end
function newData = f3( timesteps, data )
N = [0; cumsum( diff( timesteps, [], 2 ) + 1 )];
newData = NaN( 1, max(N) );
for ind = 1:size(timesteps,1)
newData(N(ind)+1:N(ind+1)) = data(timesteps(ind,1):timesteps(ind,2));
end
end
Upvotes: 5
Reputation: 2059
I would usually go for a loop, but you could do something like this:
%take every 1st column element and 2nd column element, use the range of numbers to index data
a=arrayfun(@(x,y) data(x:y),timesteps(:,1),timesteps(:,2),'UniformOutput',0)
%convert cell array to vector
a=[a{:}]
I should mention that this is significantly slower than a loop.
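If the cell array is the part to avoid, one possible loop-free sketch builds the index vector with cumsum instead. This technique is not taken from any answer in this thread; the names len, starts and idx are illustrative, it assumes the rows are in order and every row has start <= end, and its run time has not been measured here.

```matlab
timesteps = [1,3; 5,7; 9,10];
data = 1:10;

len = diff( timesteps, [], 2 ) + 1;          % elements per row (assumed >= 1)
idx = ones( 1, sum(len) );                   % default step of 1 inside each range
idx(1) = timesteps(1,1);                     % first index is the first range start
starts = cumsum( len(1:end-1) ) + 1;         % output positions where a new range begins
% at each range boundary, jump from the previous end to the next start
idx(starts) = timesteps(2:end,1) - timesteps(1:end-1,2);
idx = cumsum( idx );                         % running sum turns the steps into indices
newData = data( idx );                       % [1 2 3 5 6 7 9 10]
```

The idea is to store the step between consecutive indices (1 inside a range, a larger jump at each range boundary) and let cumsum turn those steps back into absolute positions.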
Upvotes: 1
Reputation: 899
Just to get rid of the for loop you can do the following:
timesteps = [1,3;5,7;9,10];
data = [1,2,3,4,5,6,7,8,9,10];
%create an index vector of the indices you want to extract
idx=str2num(sprintf('%d:%d ',timesteps'));
%done
res=data(idx)
res =
1 2 3 5 6 7 9 10
However, regarding run time, as stated in the comments, I have not tested it, but I doubt that it will be faster. The only advantage here is that the result array does not have to be updated on each iteration.
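To see why the one-liner works, it may help to look at the intermediate string. The sketch below splits the expression into two steps; the variable name str is added here only for illustration.

```matlab
timesteps = [1,3; 5,7; 9,10];
data = 1:10;

% sprintf consumes timesteps' column by column (1,3,5,7,9,10),
% reusing the format '%d:%d ' for each start/end pair
str = sprintf( '%d:%d ', timesteps' );   % '1:3 5:7 9:10 '

% str2num evaluates that text as a MATLAB expression, expanding each colon range
idx = str2num( str );                    % [1 2 3 5 6 7 9 10]
res = data( idx );
```

Because str2num evaluates the text it is given, this approach is best kept to indices that are known to be plain integers, as they are here.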
Upvotes: 2