RAND
RAND

Reputation: 281

Generate permutations in sequential order - R

I previously asked the following question Permutation of n bernoulli random variables in R

The answer to this question works great, as long as n is relatively small (<30), otherwise the following error code occurs Error: cannot allocate vector of size 4.0 Gb. I can get the code to run with somewhat larger values by using my desktop at work but eventually the same error occurs. Even for values that my computer can handle, say 25, the code is extremely slow.

The purpose of this code to is to calculate the difference between the CDF of an exact distribution (hence the permutations) and a normal approximation. I randomly generate some data, calculate the test statistic and then I need to determine the CDF by summing all the permutations that result in a smaller test statistic value divided by the total number of permutations.

My thought is to just generate the list of permutations one at a time, note if it is smaller than my observed value and then go on to the next one, i.e. loop over all possible permutations, but I can't just have a data frame of all the permutations to loop over because that would cause the exact same size and speed issue.

Long story short: I need to generate all possible permutations of 1's and 0's for n bernoulli trials, but I need to do this one at a time such that all of them are generated and none are generated more than once for arbitrary n. For n = 3, 2^3 = 8, I would first generate

000

calculate if my test statistic was greater (1 or 0) then generate

001

calculate again, then generate

010

calculate, then generate

100

calculate, then generate

011

etc until 111

I'm fine with this being a loop over 2^n, that outputs the permutation at each step of the loop but doesn't save them all somewhere. Also I don't care what order they are generated in, the above is just how I would list these out if I was doing it by hand.

In addition if there is anyway to speed up the previous code that would also be helpful.

Upvotes: 1

Views: 408

Answers (1)

Joseph Wood
Joseph Wood

Reputation: 7597

A good solution for your problem is iterators. There is a package called arrangements that is able to generate permutations in an iterative fashion. Observe:

library(arrangements)

# initialize iterator 
iperm <- ipermutations(0:1, 3, replace = T)

for (i in 1:(2^3)) {
    print(iperm$getnext())
}

[1] 0 0 0
[1] 0 0 1
.
.
.
[1] 1 1 1

It is written in C and is very efficient. You can also generate m permutations at a time like so:

iperm$getnext(m)

This allows for better performance because the next permutations are being generated by a for loop in C as opposed to a for loop in R.

If you really need to ramp up performance you can you the parallel package.

iperm <- ipermutations(0:1, 40, replace = T)

parallel::mclapply(1:100, function(x) {
    myPerms <- iperm$getnext(10000)
    # do something
}, mc.cores = parallel::detectCores() - 1)

Note: All code is untested.

Upvotes: 2

Related Questions