Reputation: 2964
I wish to generate the unique sequences of elements in a list where some elements are not unique in R
sequence <- c(1,0,1,0)
e.g:
result<-function(sequence)
result:
seq1 seq2 seq3 seq4 seq5 seq6
1 1 1 0 0 0 1
2 0 1 0 1 1 0
3 1 0 1 0 1 0
4 0 0 1 1 0 1
notice that all sequences contain every element from the original sequence, such that the sum of the sequence is always 2
gtools returns "too few different elements"
result <- gtools::permutations(4, 4, coseq)
I am not finding any SO post that directly solve this, but instead allow element repeats:Creating combination of sequences
achievable with expand.grid
and different lengths of sequences.
EDIT: The above is a minimal example, ideally it would work on the sequence:
sequence = c(0, 1, 0, 1, 0, 1, 0, 1, 1, 0, 1, 0, 0, 1, 1, 1)
It is somewhat important that the solution does not generate duplicates that are then subsequently removed, since a longer sequence, say 20 or 30 will be very computationally demanding if duplicates are generated.
Upvotes: 2
Views: 258
Reputation: 7608
There are a couple of packages specifically built for this.
First the arrangements
package:
## sequence is a bad name as it is a base R function so we use s instead
s <- c(1,0,1,0)
arrangements::permutations(unique(s), length(s), freq = table(s))
[,1] [,2] [,3] [,4]
[1,] 1 1 0 0
[2,] 1 0 1 0
[3,] 1 0 0 1
[4,] 0 1 1 0
[5,] 0 1 0 1
[6,] 0 0 1 1
Next, we have RcppAlgos
(I am the author):
RcppAlgos::permuteGeneral(unique(s), length(s), freqs = table(s))
[,1] [,2] [,3] [,4]
[1,] 1 1 0 0
[2,] 1 0 1 0
[3,] 1 0 0 1
[4,] 0 1 1 0
[5,] 0 1 0 1
[6,] 0 0 1 1
They are both very efficient as well. To give you an idea, for the actual need by the OP, the other methods will fail (I think there is a limit on the number of rows for a matrix ... 2^31 - 1, not sure though) or take a very long time as they will have to generate 16! ~= 2.092e+13
permutations before any further processing. However, with these two packages, the return is instant:
## actual example needed by OP
sBig <- c(0, 1, 0, 1, 0, 1, 0, 1, 1, 0, 1, 0, 0, 1, 1, 1)
system.time(a <- arrangements::permutations(unique(sBig), length(sBig), freq = table(sBig)))
user system elapsed
0.001 0.001 0.002
system.time(b <- RcppAlgos::permuteGeneral(unique(sBig), length(sBig), freqs = table(sBig)))
user system elapsed
0.001 0.001 0.002
identical(a, b)
[1] TRUE
dim(a)
[1] 11440 16
Upvotes: 3
Reputation: 32558
m = apply(gtools::permutations(2, 4, 1:4, repeats.allowed = TRUE), 1, function(x) sequence[x])
m[,colSums(m) == 2]
# [,1] [,2] [,3] [,4] [,5] [,6]
#[1,] 1 1 1 0 0 0
#[2,] 1 0 0 1 1 0
#[3,] 0 1 0 1 0 1
#[4,] 0 0 1 0 1 1
Upvotes: 3
Reputation: 50738
Since you mentioned gtools::permutations
, you could do this
First generate all permutations
m <- apply(permutations(4, 4, 1:length(sequence)), 1, function(x) sequence[x])
# [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14]
#[1,] 1 1 1 1 1 1 0 0 0 0 0 0 1 1
#[2,] 0 0 1 1 0 0 1 1 1 1 0 0 1 1
#[3,] 1 0 0 0 0 1 1 0 1 0 1 1 0 0
#[4,] 0 1 0 0 1 0 0 1 0 1 1 1 0 0
# [,15] [,16] [,17] [,18] [,19] [,20] [,21] [,22] [,23] [,24]
#[1,] 1 1 1 1 0 0 0 0 0 0
#[2,] 0 0 0 0 1 1 0 0 1 1
#[3,] 1 0 1 0 0 1 1 1 1 0
#[4,] 0 1 0 1 1 0 1 1 0 1
Then remove duplicate columns (from the indistinguishability of the 1's and 0's)
m[, !duplicated(apply(m, 2, paste, collapse = ""))]
# [,1] [,2] [,3] [,4] [,5] [,6]
#[1,] 1 1 1 0 0 0
#[2,] 0 0 1 1 1 0
#[3,] 1 0 0 1 0 1
#[4,] 0 1 0 0 1 1
Upvotes: 2