Sascha

Reputation: 159

How to find all possible combinations of a given set of variables

I have a dataset with 6 variables:

Var1 <- c(1,0,1,0,1)
Var2 <- c(1,0,1,0,1)
Var3 <- c(1,1,1,0,1)
Var4 <- c(1,0,1,1,1)
Var5 <- c(1,0,0,0,1)
Var6 <- c(1,0,1,0,1)

DF <- data.frame(Var1, Var2, Var3, Var4, Var5, Var6)
DF

which results in

  Var1 Var2 Var3 Var4 Var5 Var6
1    1    1    1    1    1    1
2    0    0    1    0    0    0
3    1    1    1    1    0    1
4    0    0    0    1    0    0
5    1    1    1    1    1    1

I want to find all possible combinations of these variables: how many 2-variable combinations (e.g. Var1Var2, Var2Var4, Var5Var4, etc.), 3-variable combinations, 4-variable combinations, and so on do I have? Is there a way to calculate this?

Thanks.

Upvotes: 0

Views: 223

Answers (2)

ThomasIsCoding

Reputation: 101044

Try this to count the combinations of each size (2 through 6 variables):

> choose(length(DF), 2:length(DF))
[1] 15 20 15  6  1

or, to list the combinations themselves:

> lapply(
+   2:length(DF),
+   combn,
+   x = names(DF)
+ )
[[1]]
     [,1]   [,2]   [,3]   [,4]   [,5]   [,6]   [,7]   [,8]   [,9]   [,10]
[1,] "Var1" "Var1" "Var1" "Var1" "Var1" "Var2" "Var2" "Var2" "Var2" "Var3"
[2,] "Var2" "Var3" "Var4" "Var5" "Var6" "Var3" "Var4" "Var5" "Var6" "Var4"
     [,11]  [,12]  [,13]  [,14]  [,15]
[1,] "Var3" "Var3" "Var4" "Var4" "Var5"
[2,] "Var5" "Var6" "Var5" "Var6" "Var6"

[[2]]
     [,1]   [,2]   [,3]   [,4]   [,5]   [,6]   [,7]   [,8]   [,9]   [,10]
[1,] "Var1" "Var1" "Var1" "Var1" "Var1" "Var1" "Var1" "Var1" "Var1" "Var1"
[2,] "Var2" "Var2" "Var2" "Var2" "Var3" "Var3" "Var3" "Var4" "Var4" "Var5"
[3,] "Var3" "Var4" "Var5" "Var6" "Var4" "Var5" "Var6" "Var5" "Var6" "Var6"
     [,11]  [,12]  [,13]  [,14]  [,15]  [,16]  [,17]  [,18]  [,19]  [,20]
[1,] "Var2" "Var2" "Var2" "Var2" "Var2" "Var2" "Var3" "Var3" "Var3" "Var4"
[2,] "Var3" "Var3" "Var3" "Var4" "Var4" "Var5" "Var4" "Var4" "Var5" "Var5"
[3,] "Var4" "Var5" "Var6" "Var5" "Var6" "Var6" "Var5" "Var6" "Var6" "Var6"

[[3]]
     [,1]   [,2]   [,3]   [,4]   [,5]   [,6]   [,7]   [,8]   [,9]   [,10]
[1,] "Var1" "Var1" "Var1" "Var1" "Var1" "Var1" "Var1" "Var1" "Var1" "Var1"
[2,] "Var2" "Var2" "Var2" "Var2" "Var2" "Var2" "Var3" "Var3" "Var3" "Var4"
[3,] "Var3" "Var3" "Var3" "Var4" "Var4" "Var5" "Var4" "Var4" "Var5" "Var5"
[4,] "Var4" "Var5" "Var6" "Var5" "Var6" "Var6" "Var5" "Var6" "Var6" "Var6"
     [,11]  [,12]  [,13]  [,14]  [,15]
[1,] "Var2" "Var2" "Var2" "Var2" "Var3"
[2,] "Var3" "Var3" "Var3" "Var4" "Var4"
[3,] "Var4" "Var4" "Var5" "Var5" "Var5"
[4,] "Var5" "Var6" "Var6" "Var6" "Var6"

[[4]]
     [,1]   [,2]   [,3]   [,4]   [,5]   [,6]
[1,] "Var1" "Var1" "Var1" "Var1" "Var1" "Var2"
[2,] "Var2" "Var2" "Var2" "Var2" "Var3" "Var3"
[3,] "Var3" "Var3" "Var3" "Var4" "Var4" "Var4"
[4,] "Var4" "Var4" "Var5" "Var5" "Var5" "Var5"
[5,] "Var5" "Var6" "Var6" "Var6" "Var6" "Var6"

[[5]]
     [,1]
[1,] "Var1"
[2,] "Var2"
[3,] "Var3"
[4,] "Var4"
[5,] "Var5"
[6,] "Var6"
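
If you would rather have labels such as Var1Var2 than name matrices, a minimal sketch along the same lines is to pass a function to combn. The names sizes and combo_labels below are illustrative only, and the data frame is rebuilt from the question so that the snippet runs on its own:

```r
# Rebuild the example data from the question
DF <- data.frame(
  Var1 = c(1, 0, 1, 0, 1), Var2 = c(1, 0, 1, 0, 1), Var3 = c(1, 1, 1, 0, 1),
  Var4 = c(1, 0, 1, 1, 1), Var5 = c(1, 0, 0, 0, 1), Var6 = c(1, 0, 1, 0, 1)
)

# Combination sizes of interest: 2 variables up to all 6 (illustrative name)
sizes <- 2:length(DF)

# For each size m, paste the m variable names of every combination together
combo_labels <- lapply(sizes, function(m) {
  combn(names(DF), m, FUN = paste, collapse = "")
})
names(combo_labels) <- paste0("size_", sizes)

head(combo_labels$size_2, 3)  # "Var1Var2" "Var1Var3" "Var1Var4"
lengths(combo_labels)         # 15 20 15 6 1, matching choose(6, 2:6)
sum(lengths(combo_labels))    # 57 combinations of two or more variables
```

Each list element is a plain character vector, so the labels can be used directly, for example as names for new columns.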

Upvotes: 1

Philip Schalk

Reputation: 61

Well, since all variables in your case are binary, the number of possible value combinations for k variables is just 2^k.

To count the combinations for non-binary variables as well, you can use the function expand.grid and then count the number of rows. As you probably don't want to count the same combination twice, count only the unique rows. Here is a simple example:

> library(dplyr)
> var1 <- c(1,2,2,3,5)
> var2 <- c(1,1,1,2,3)
> expand.grid(var1, var2) %>% unique %>% nrow
[1] 12
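
The same idea can be applied to the data frame from the question. This is only a sketch in base R (no dplyr needed); the names values and possible are illustrative, and it assumes that "combinations" here means combinations of values across the six variables:

```r
# Rebuild the example data from the question
DF <- data.frame(
  Var1 = c(1, 0, 1, 0, 1), Var2 = c(1, 0, 1, 0, 1), Var3 = c(1, 1, 1, 0, 1),
  Var4 = c(1, 0, 1, 1, 1), Var5 = c(1, 0, 0, 0, 1), Var6 = c(1, 0, 1, 0, 1)
)

# Distinct values that each variable actually takes (0 and 1 for every column)
values <- lapply(DF, unique)

# Every possible value combination across the six variables
possible <- expand.grid(values)
nrow(possible)    # 64, i.e. 2^6 because each variable takes two values

# Value combinations that actually occur in the data
nrow(unique(DF))  # 4, since rows 1 and 5 are identical
```

Deduplicating each variable before calling expand.grid keeps the grid small, which matters once the variables have many repeated values.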

Upvotes: 0
