Emily

Reputation: 315

Combinations by group in R

I have a question about combinations by group.

My mini-sample looks like this:

sample <- data.frame(
  group=c("a","a","a","a","b","b","b"),
  number=c(1,2,3,2,4,5,3)
)

If I apply combn to the 'number' column of the data frame, it gives me the following result: every pairwise combination of the values in 'number', regardless of which group each value belongs to:

         [,1] [,2]
   [1,]    1    2
   [2,]    1    3
   [3,]    1    2
   [4,]    1    4
   [5,]    1    5
   [6,]    1    3
   [7,]    2    3
   [8,]    2    2
   [9,]    2    4
  [10,]    2    5
  [11,]    2    3
  [12,]    3    2
  [13,]    3    4
  [14,]    3    5
  [15,]    3    3
  [16,]    2    4
  [17,]    2    5
  [18,]    2    3
  [19,]    4    5
  [20,]    4    3
  [21,]    5    3

The code that I used for the results above is as follows:

t(combn((sample$number), 2))

However, I would like to get the combination results within the group (i.e., "a", "b"). Therefore, the result that I want to get should look like this:

     [,1] [,2] [,3]
[1,]   a    1    2
[2,]   a    1    3
[3,]   a    1    2
[4,]   a    2    3
[5,]   a    2    2
[6,]   a    3    2
[7,]   b    4    5
[8,]   b    4    3
[9,]   b    5    3

In addition to the combinations, I would like to get the column indicating the group.

Upvotes: 8

Views: 3990

Answers (2)

talat

Reputation: 70256

Here's a base R option using (1) split to create a list of data.frames per unique group-entry, (2) lapply to loop over each list element and compute the combinations using combn, (3) do.call(rbind, ...) to collect the list elements back into a single data.frame.

do.call(rbind, lapply(split(sample, sample$group), function(x) {
  data.frame(group = x$group[1], t(combn(x$number, 2)))
}))

#    group X1 X2
#a.1     a  1  2
#a.2     a  1  3
#a.3     a  1  2
#a.4     a  2  3
#a.5     a  2  2
#a.6     a  3  2
#b.1     b  4  5
#b.2     b  4  3
#b.3     b  5  3
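
To see what each of the three steps contributes, the one-liner can be unpacked as below. This is only a sketch: the names pieces, combos and result are mine, not part of the original answer, and resetting the row names is optional.

```r
sample <- data.frame(
  group  = c("a", "a", "a", "a", "b", "b", "b"),
  number = c(1, 2, 3, 2, 4, 5, 3)
)

# (1) one data.frame per group
pieces <- split(sample, sample$group)

# (2) all pairs within each piece, tagged with that piece's group label
combos <- lapply(pieces, function(x) {
  data.frame(group = x$group[1], t(combn(x$number, 2)))
})

# (3) stack the pieces; optionally drop the "a.1", "a.2", ... row names
result <- do.call(rbind, combos)
rownames(result) <- NULL

nrow(result)  # choose(4, 2) + choose(3, 2) = 6 + 3 = 9
```

One thing to watch: combn(x, 2) treats a single positive integer x as seq_len(x), so a group with only one row would probably need to be filtered out or handled separately before step (2).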

And a dplyr option:

library(dplyr)
sample %>% group_by(group) %>% do(data.frame(t(combn(.$number, 2))))
#Source: local data frame [9 x 3]
#Groups: group [2]
#
#   group    X1    X2
#  (fctr) (dbl) (dbl)
#1      a     1     2
#2      a     1     3
#3      a     1     2
#4      a     2     3
#5      a     2     2
#6      a     3     2
#7      b     4     5
#8      b     4     3
#9      b     5     3

Upvotes: 4

akrun

Reputation: 886938

We can group by 'group' with data.table and compute the combinations within each group:

library(data.table)
setDT(sample)[, {i1 <-  combn(number, 2)
                   list(i1[1,], i1[2,]) }, by =  group]
#    group V1 V2
#1:     a  1  2
#2:     a  1  3
#3:     a  1  2
#4:     a  2  3
#5:     a  2  2
#6:     a  3  2
#7:     b  4  5
#8:     b  4  3
#9:     b  5  3

A more compact option would be

setDT(sample)[, transpose(combn(number, 2, FUN = list)), by = group]

Or using base R

lst <- by(sample$number, sample$group, FUN = combn, m = 2)
data.frame(group = rep(unique(as.character(sample$group)),
                       sapply(lst, ncol)), t(do.call(cbind, lst)))

Upvotes: 4
