MathUser

Reputation: 163

Multiple sampling with different sample sizes in MATLAB

I am trying to implement this code so it works as quickly as possible.

Say I have a population of 100 different values, you can think of it as pop = 1:100 or pop = randn(1,100) to keep things simple. I have a vector n which gives me the size of samples I want to get. Say, for example, that n=[1 3 10 6 2]. What I want to do is to take 5 (which in reality is length(n)) different samples of pop, each consisting of n(i) elements without replacement. This means that for my first sample I want 1 element out of pop, for the second sample I want 3, for the third I want 10, and so on.

To be honest, I am not really interested in which elements are sampled. What I want is the sum of the elements in the i-th sample. This would be trivial to implement with a loop, but I am trying to avoid loops to keep my code as quick as possible. I have to do this for many different populations, and with length(n) being very large.

If I had to do it with a loop, this would be how:

pop = randn(1,100);
n = [1 3 10 6 2];
sum_sample = zeros(length(n),1);
for i = 1:length(n)
  sum_sample(i,1) = sum(randsample(pop,n(i)));
end

Is there a way to do this without a loop?

Upvotes: 2

Views: 148

Answers (3)

Dennis Jaheruddin

Reputation: 21563

The only way to figure out what is fastest for you is to do a comparison of the different methods.

In fact the loop appears to be very fast in this case!

pop = randn(1,100);
n = [1 3 10 6 2];

tic
sr = @(n) sum(randsample(pop,n));
sum_sample = arrayfun(sr,n);
toc % Returns about 0.004 (seconds)

clear su
tic
for t=numel(n):-1:1
    su(t)=sum(randsample(pop,n(t)));
end
toc % Returns about 0.003 (seconds)
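A single tic/toc reading of a few milliseconds can be noisy, so averaging over many repetitions may give a steadier comparison. A minimal sketch, assuming 1000 repetitions is enough (reps, t_arrayfun and t_loop are illustrative names, not part of the code above):

```matlab
pop = randn(1,100);
n = [1 3 10 6 2];
reps = 1000;                         % assumed repetition count

% Average time of the arrayfun version
sr = @(k) sum(randsample(pop,k));
tic
for r = 1:reps
    sum_sample = arrayfun(sr,n);
end
t_arrayfun = toc/reps;

% Average time of the plain loop version
su = zeros(1,numel(n));
tic
for r = 1:reps
    for t = 1:numel(n)
        su(t) = sum(randsample(pop,n(t)));
    end
end
t_loop = toc/reps;

fprintf('arrayfun: %.2e s, loop: %.2e s\n', t_arrayfun, t_loop);
```

The absolute numbers depend on the machine and MATLAB version, so only the relative difference is meaningful.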

Upvotes: 1

NKN

Reputation: 6414

You can do something like this:

pop = randn(1,100);
n = [1 3 10 6 2];
sampled_data_index = randi(length(pop),1,sum(n));
sampled_data = pop(sampled_data_index);

The randi function draws random integers from a specified range, which makes them suitable for indexing. Note that randi samples with replacement, so the same index can appear more than once. Once you have the indices, you can use them all at once to pull the values out of pop.

If you want unique indices, you can replace randi with randperm. Note that this makes the indices unique across all samples combined, so it requires sum(n) <= length(pop):

sampled_data_index = randperm(length(pop),sum(n));
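To go from one long index vector to per-sample sums, the draws can be labelled by sample and summed per label. A minimal sketch, assuming repelem is available (R2015a or later) and that it is acceptable for the samples to be disjoint from each other, which differs from drawing each sample independently (group is an illustrative name):

```matlab
pop = randn(1,100);
n = [1 3 10 6 2];

% One draw of sum(n) distinct indices; requires sum(n) <= numel(pop)
idx = randperm(numel(pop), sum(n));

% Label each draw with the sample it belongs to: [1 2 2 2 3 3 ...]
group = repelem(1:numel(n), n);

% Sum the sampled values per label; result is numel(n)-by-1
sum_sample = accumarray(group(:), pop(idx).', [numel(n) 1]);
```

Because no element is shared between samples, the sums are not independent of each other the way they are in the loop version.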

Finally:

You can have all the sampled values as a cell variable using the following code:

pop = randn(1,100);
n = [1 3 10 6 2];
fun = @(m) pop(randperm(length(pop),m));
C = arrayfun(fun,n,'UniformOutput',false)

To get the sum of each sample instead:

funs = @(m) sum(pop(randperm(length(pop),m)));
sumC = arrayfun(funs,n)

Upvotes: 0

hbaderts

Reputation: 14316

You can create a function handle which chooses the random samples and sums them up. Then you can use arrayfun to execute this function for all values of n:

pop = randn(1,100);
n = [1 3 10 6 2];
sr = @(n) sum(randsample(pop,n));
sum_sample = arrayfun(sr,n);
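Note that arrayfun still calls the handle once per element of n. For very large length(n), a version with no per-sample call at all may be worth timing. A minimal sketch, assuming the numel(pop)-by-numel(n) matrices fit in memory and MATLAB R2016b or later for implicit expansion (order and mask are illustrative names):

```matlab
pop = randn(1,100);
n = [1 3 10 6 2];
N = numel(pop);

% Sorting random keys column-wise gives one independent random
% permutation of 1:N per column (one column per sample)
[~, order] = sort(rand(N, numel(n)), 1);

% Keep only the first n(j) rows of column j
mask = (1:N).' <= n;                 % N-by-numel(n) logical

% pop(order) has the shape of order; masked entries contribute zero
sum_sample = sum(pop(order) .* mask, 1).';
```

Each column takes the first n(j) entries of its own random permutation, so every sample is drawn without replacement and independently of the others. The trade-off is the extra memory and the cost of sorting full columns, so this is not guaranteed to beat the loop.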

Upvotes: 0
