Spherical Cow
Spherical Cow

Reputation: 3

Make vector of elements less than each element of another vector

I have a vector, v, of N positive integers whose values I do not know ahead of time. I would like to construct another vector, a, where the values in this new vector are determined by the values in v according to the following rules: - The elements in a are all integers up to and including the value of each element in v - 0 entries are included only once, but positive integers appear twice in a row

For example, if v is [1,0,2] then a should be: [0,1,1,0,0,1,1,2,2].

Is there a way to do this without just doing a for-loop with lots of if statements?

I've written the code in loop format but would like a vectorized function to handle it.

Upvotes: 0

Views: 94

Answers (3)

Nicky Mattsson
Nicky Mattsson

Reputation: 3052

The classical version of your problem is to create a vector a with the concatenation of 1:n(i) where n(i) is the ith entry in a vector b, e.g.

b = [1,4,2];

gives a vector a

a = [1,1,2,3,4,1,2];

This problem is solved using cumsum on a vector ones(1,sum(b)) but resetting the sum at the points 1+cumsum(b(1:end-1)) corresponding to where the next sequence starts.

To solve your specific problem, we can do something similar. As you need two entries per step, we use a vector 0.5 * ones(1,sum(b*2+1)) together with floor. As you in addition only want the entry 0 to occur once, we will just have to start each sequence at 0.5 instead of at 0 (which would yield floor([0,0.5,...]) = [0,0,...]).

So in total we have something like

% construct the list of 0.5s
a = 0.5*ones(1,sum(b*2+1))

% Reset the sum where a new sequence should start
a(cumsum(b(1:end-1)*2+1)+1) =a(cumsum(b(1:end-1)*2+1)+1)*2 -(b(1:end-1)+1)

% Cumulate it and find the floor
a = floor(cumsum(a)) 

Note that all operations here are vectorised!

Benchmark:

You can do a benchmark using the following code

function SO()
b =randi([0,100],[1,1000]);

t1 = timeit(@() Nicky(b));
t2 = timeit(@() Recursive(b));
t3 = timeit(@() oneliner(b));

if all(Nicky(b) == Recursive(b)) && all(Recursive(b) == oneliner(b))
    disp("All methods give the same result")
else
    disp("Something wrong!")
end

disp("Vectorised time: "+t1+"s")
disp("Recursive time: "+t2+"s")
disp("One-Liner time: "+t3+"s")
end

function [a] = Nicky(b)
a = 0.5*ones(1,sum(b*2+1));
a(cumsum(b(1:end-1)*2+1)+1) =a(cumsum(b(1:end-1)*2+1)+1)*2 -(b(1:end-1)+1);
a = floor(cumsum(a));
end

function out=Recursive(arr)
    out=myfun(arr);
    function local_out=myfun(arr)
        if isscalar(arr)
            if arr
                local_out=sort([0,1:arr,1:arr]); % this is faster
            else
                local_out=0;
            end
        else
            local_out=[myfun(arr(1:end-1)),myfun(arr(end))];
        end
    end
end

function b = oneliner(a)
b = cell2mat(arrayfun(@(x)sort([0,1:x,1:x]),a,'UniformOutput',false));
end

Which gives me

All methods give the same result
Vectorised time: 0.00083574s
Recursive time: 0.0074404s
One-Liner time: 0.0099933s

So the vectorised one is indeed the fastest, by a factor approximately 10.

Upvotes: 1

Argyll
Argyll

Reputation: 9875

Is there a way to do this without just doing a for loop with lots of if statements?

Sure. How about recursion? Of course, there is no guarantee that Matlab has tail call optimization.

For example, in a file named filename.m

function out=filename(arr)
    out=myfun(in);
    function local_out=myfun(arr)
        if isscalar(arr)
            if arr
                local_out=sort([0,1:arr,1:arr]); % this is faster
            else
                local_out=0;
            end
        else
            local_out=[myfun(arr(1:end-1)),myfun(arr(end))];
        end
    end
end

in cmd, type

input=[1,0,2];
filename(input);

You can take off the parent function. I added it just hoping Matlab can spot the recursion within filename.m and optimize for it.

would like a vectorized function to handle it.

Sure. Although I don't see the point of vectorizing in such a unique puzzle that is not generalizable to other applications. I also don't foresee a performance boost.

For example, assuming input is 1-by-N. In cmd, type

input=[1,0,2];
cell2mat(arrayfun(@(x)sort([0,1:x,1:x]),input,'UniformOutput',false)

Benchmark

In R2018a

>> clear all
>> in=randi([0,100],[1,100]); N=10000;

>> T=zeros(N,1);tic; for i=1:N; filename(in) ;T(i)=toc;end; mean(T),
ans =
    1.5647

>> T=zeros(N,1);tic; for i=1:N; cell2mat(arrayfun(@(x)sort([0,1:x,1:x]),in,'UniformOutput',false)); T(i)=toc;end; mean(T),
ans =
    3.8699

Ofc, I tested with a few more different inputs. The 'vectorized' method is always about twice as long.

Conclusion: Recursion is faster.

Upvotes: 0

Alex
Alex

Reputation: 124

This can be done with a one-liner using eval:

a = eval(['[' sprintf('sort([0 1:%i 1:%i]) ',[v(:) v(:)]') ']']);

Here is another solution that does not use eval. Not sure what is intended by "vectorized function" but the following code is compact and can be easily made into a function:

a = [];
for i = 1:numel(v)
    a = [a sort([0 1:v(i) 1:v(i)])];
end

Upvotes: 0

Related Questions