jj367
jj367

Reputation: 25

R - Select unique combinations of values from within the same column

Assuming a data frame in the following structure:

ColA  ColB
A     1
A     2
A     4
B     3
B     2
B     4
C     1
C     1
C     1

Is it possible to select all combinations of A, B and C that sum to 7 (using each of A, B & C once), for example as follows:

ColA  Combination1  Combination2
A     2             4
B     4             2
C     1             1

Alternatively, would it be better to loop through all unique combinations of A, B & C to determine their total and then column bind them? If so, how would I loop through the unique combinations of A, B & C assuming data in the structure of the first data frame?

Upvotes: 1

Views: 275

Answers (1)

mra68
mra68

Reputation: 2960

Here is my suggestion:

df <- read.table( header =TRUE,
                  text = "ColA  ColB
                          A     1
                          A     2
                          A     4
                          B     3
                          B     2
                          B     4
                          C     1
                          C     1
                          C     1")

X <- tapply(df$ColB, df$ColA, unique)
G <- expand.grid(X)
G <- G[rowSums(G)==7, ]

Combinations <- data.frame(t(G))
colnames(Combinations) <- paste0("Comb_", 1:ncol(Combinations))

The list X contains the unique values of A,B, and C:

X
#$A
#[1] 1 2 4

#$B
#[1] 3 2 4

#$C
#[1] 1

Then expand.grid builds the cartesian product G, i.e. the combinations of these unique values:

G
#  A B C
#1 1 3 1
#2 2 3 1
#3 4 3 1
#4 1 2 1
#5 2 2 1
#6 4 2 1
#7 1 4 1
#8 2 4 1
#9 4 4 1

After G <- G[rowSums(G)==7,] only the combinations with a total sum of 7 are left:

G
#  A B C
#6 4 2 1
#8 2 4 1

Finally the data in the matrix G is put into the data.frame Combinations:

Combinations
#  Comb_1 Comb_2
#A      4      2
#B      2      4
#C      1      1

Upvotes: 1

Related Questions