Reputation: 125
Assuming I have a series of column vectors of different lengths, what is the fastest way, in terms of computation time, to join them all into one matrix whose number of rows is set by the longest column, with the extra cells of the shorter columns filled with NaNs?
Edit: Please note that I am trying to avoid cell arrays, since they are expensive in terms of memory and run time.
For example:
A = [1;2;3;4];
B = [5;6];
C = magicFunction(A,B);
Result:
C =
1 5
2 6
3 NaN
4 NaN
Upvotes: 1
Views: 173
Reputation: 221704
The following code avoids cell arrays except when counting the number of elements in each vector, which keeps the code a bit cleaner. The price of using cell arrays for that tiny bit of work shouldn't be too high. Also, varargin gives you the inputs as a cell array anyway. You could avoid cell arrays there too, but that would most probably involve for-loops and a separate variable name for each input, which isn't too elegant when creating a function with an unknown number of inputs. Otherwise, the code uses numeric arrays, logical indexing and my favourite, bsxfun, which must be cheap in the market of runtimes.
Function Code
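function out = magicFunction(varargin)
%// Number of elements in each input column vector
lens = cellfun(@numel,varargin);
%// Preallocate the output with NaNs: longest length x number of vectors
out = NaN(max(lens),numel(lens));
%// Mask is true for the valid rows of each column; fill them in column order
out(bsxfun(@le,(1:max(lens))',lens)) = vertcat(varargin{:});
return;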
Example
Script -
A1 = [9;2;7;8];
A2 = [1;5];
A3 = [2;6;3];
out = magicFunction(A1,A2,A3)
Output -
out =
9 1 2
2 5 6
7 NaN 3
8 NaN NaN
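To see why the masked assignment puts every value in the right place, the mask for the example above can be built by hand. This is only a sketch of the idea, reusing A1, A2 and A3 from the example; the variable names lens, mask and out are chosen here for illustration.

```matlab
% Inputs from the example above
A1 = [9;2;7;8];
A2 = [1;5];
A3 = [2;6;3];

% Length of each column vector
lens = [numel(A1) numel(A2) numel(A3)];   % [4 2 3]

% mask(r,c) is true when row r is within the length of column c
mask = bsxfun(@le,(1:max(lens))',lens);
disp(mask)
%    1   1   1
%    1   1   1
%    1   0   1
%    1   0   0

% Logical indexing visits the true entries column by column, which is the
% same order in which [A1;A2;A3] stacks the values
out = NaN(size(mask));
out(mask) = [A1;A2;A3];
disp(out)
```

In MATLAB R2016b and later, implicit expansion should give the same mask with (1:max(lens))' <= lens, without calling bsxfun.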
Benchmarking
As part of the benchmarking, we compare our solution to @gnovice's solution, which was mostly based on cell arrays. Our intention here is to see what speedup, if any, we get by avoiding cell arrays. Here's the benchmarking code with 20 vectors -
%// Let's create row vectors A1,A2,A3.. to be used with @gnovice's solution
num_vectors = 20;
max_vector_length = 1500000;
vector_lengths = randi(max_vector_length,num_vectors,1);
vs =arrayfun(@(x) randi(9,1,vector_lengths(x)),1:numel(vector_lengths),'uni',0);
[A1,A2,A3,A4,A5,A6,A7,A8,A9,A10,A11,A12,A13,A14,A15,A16,A17,A18,A19,A20] = vs{:};
%// Maximally cell-array based approach used in linked @gnovice's solution
disp('--------------------- With @gnovice''s approach')
tic
tcell = {A1,A2,A3,A4,A5,A6,A7,A8,A9,A10,A11,A12,A13,A14,A15,A16,A17,A18,A19,A20};
maxSize = max(cellfun(@numel,tcell)); %# Get the maximum vector size
fcn = @(x) [x nan(1,maxSize-numel(x))]; %# Create an anonymous function
rmat = cellfun(fcn,tcell,'UniformOutput',false); %# Pad each cell with NaNs
rmat = vertcat(rmat{:});
toc, clear tcell maxSize fcn rmat
%// Transpose each of the input vectors to get column vectors as needed
%// for our problem
vs = cellfun(@(x) x',vs,'uni',0); %//'
[A1,A2,A3,A4,A5,A6,A7,A8,A9,A10,A11,A12,A13,A14,A15,A16,A17,A18,A19,A20] = vs{:};
%// Our solution
disp('--------------------- With our new approach')
tic
out = magicFunction(A1,A2,A3,A4,A5,A6,A7,A8,A9,A10,...
A11,A12,A13,A14,A15,A16,A17,A18,A19,A20);
toc
Results -
--------------------- With @gnovice's approach
Elapsed time is 1.511669 seconds.
--------------------- With our new approach
Elapsed time is 0.671604 seconds.
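The tic/toc figures above come from a single run each. As a hedged alternative, MATLAB's timeit calls a function handle several times and returns the median of the measurements, which tends to be less noisy. The sketch below is not the benchmark used above: timePadding and padCols are hypothetical names, the vectors are much shorter so that it runs quickly, and padCols simply repeats the body of magicFunction so that the file is self-contained.

```matlab
function t = timePadding()
% timePadding (hypothetical name): time the NaN-padding step with timeit.
rng(0);                 % fixed seed so the vector lengths are repeatable
numVectors = 20;
maxLen = 1000;          % assumption: far smaller than the 1500000 used above
lens = randi(maxLen,1,numVectors);
% One column vector of random digits per cell
vs = arrayfun(@(n) randi(9,n,1),lens,'UniformOutput',false);
% timeit runs the handle repeatedly and returns the median time in seconds
t = timeit(@() padCols(vs{:}));
end

function out = padCols(varargin)
% Same body as magicFunction above, repeated to keep this file self-contained
lens = cellfun(@numel,varargin);
out = NaN(max(lens),numel(lens));
out(bsxfun(@le,(1:max(lens))',lens)) = vertcat(varargin{:});
end
```

Saved as timePadding.m, it can be called as t = timePadding(). The absolute number depends on the machine, so it is only meaningful when compared against another approach timed in the same way.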
Conclusions -
With 20 vectors and a maximum length of 1500000, the speedup is between 2x and 3x, and it was seen to increase as we increased the number of vectors. The results that show this are not included here to save space, as we have already used quite a lot of it.
Upvotes: 1
Reputation: 1164
If you use a cell array, you won't need to fill anything with NaNs: just write each array into its own column and the unused elements stay empty (that would be the space-efficient way). You could either use:
cell_result{1} = A;
cell_result{2} = B;
This results in a cell array of size 2 whose elements hold all of A and B. Or, if you want them to be saved as columns, with one value per cell:
cell_result(1:numel(A),1) = num2cell(A);
cell_result(1:numel(B),2) = num2cell(B);
If you need them to be filled with NaNs for later code, the easiest way is to find the maximum length among your arrays and create a matrix of size (max_length x number of arrays).
So let's say you have n=5 arrays: A, B, C, D and E.
n=5;
h=zeros(1,n);
h(1)=numel(A);
h(2)=numel(B);
h(3)=numel(C);
h(4)=numel(D);
h(5)=numel(E);
max_No_Entries=max(h);
result=NaN(max_No_Entries,n);
result(1:numel(A),1)=A;
result(1:numel(B),2)=B;
result(1:numel(C),3)=C;
result(1:numel(D),4)=D;
result(1:numel(E),5)=E;
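The same preallocate-and-fill pattern can be written with a loop so that it does not depend on the number of arrays. The following is a minimal sketch, not part of the answer above: padColumns is a hypothetical name, and it receives its inputs through varargin, which is itself a cell array.

```matlab
function out = padColumns(varargin)
% padColumns (hypothetical name): join column vectors of different lengths
% into one NaN-padded matrix, one vector per column.
n = numel(varargin);
lens = zeros(1,n);
for k = 1:n
    lens(k) = numel(varargin{k});    % length of the k-th vector
end
out = NaN(max(lens),n);              % preallocate, already filled with NaN
for k = 1:n
    out(1:lens(k),k) = varargin{k};  % copy the k-th vector into column k
end
end
```

Called with the vectors from the question, C = padColumns([1;2;3;4],[5;6]) should return the 4-by-2 matrix shown there, with NaN in the last two rows of the second column.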
Upvotes: 0