Reputation: 315
I have a question about combinations by group.
My mini-sample looks like this:
sample <- data.frame(
group=c("a","a","a","a","b","b","b"),
number=c(1,2,3,2,4,5,3)
)
If I apply combn to the 'number' column of the data frame, it gives me the following result: all the combinations of the values under the 'number' column, regardless of which group each value belongs to:
[,1] [,2]
[1,] 1 2
[2,] 1 3
[3,] 1 2
[4,] 1 4
[5,] 1 5
[6,] 1 3
[7,] 2 3
[8,] 2 2
[9,] 2 4
[10,] 2 5
[11,] 2 3
[12,] 3 2
[13,] 3 4
[14,] 3 5
[15,] 3 3
[16,] 2 4
[17,] 2 5
[18,] 2 3
[19,] 4 5
[20,] 4 3
[21,] 5 3
The code that I used for the results above is as follows:
t(combn((sample$number), 2))
However, I would like to get the combinations within each group (i.e., "a" and "b") only. The result that I want should look like this:
[,1] [,2] [,3]
[1,] a 1 2
[2,] a 1 3
[3,] a 1 2
[4,] a 2 3
[5,] a 2 2
[6,] a 3 2
[7,] b 4 5
[8,] b 4 3
[9,] b 5 3
In addition to the combinations, I would like to get the column indicating the group.
Upvotes: 8
Views: 3990
Reputation: 70256
Here's a base R option using (1) split to create a list of data.frames, one per unique group, (2) lapply to loop over each list element and compute the combinations using combn, and (3) do.call(rbind, ...) to collect the list elements back into a single data.frame.
do.call(rbind, lapply(split(sample, sample$group), function(x) {
  data.frame(group = x$group[1], t(combn(x$number, 2)))
}))
# group X1 X2
#a.1 a 1 2
#a.2 a 1 3
#a.3 a 1 2
#a.4 a 2 3
#a.5 a 2 2
#a.6 a 3 2
#b.1 b 4 5
#b.2 b 4 3
#b.3 b 5 3
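The three steps can also be written out one at a time, which makes the intermediate objects easier to inspect. This is only a sketch of the same approach: the names df, pieces, pairs and result are made up for illustration (df is used instead of sample so that base::sample is not masked), and resetting the row names at the end is optional.

```r
# Same data as in the question, under the hypothetical name `df`.
df <- data.frame(
  group  = c("a", "a", "a", "a", "b", "b", "b"),
  number = c(1, 2, 3, 2, 4, 5, 3)
)

# (1) split: one data.frame per group
pieces <- split(df, df$group)

# (2) lapply + combn: all pairs within each piece, with the group label attached
pairs <- lapply(pieces, function(x) {
  data.frame(group = x$group[1], t(combn(x$number, 2)))
})

# (3) do.call(rbind, ...): stack the per-group results into one data.frame
result <- do.call(rbind, pairs)

# Optional: drop the a.1, a.2, ... row names created by rbind
rownames(result) <- NULL
```

Printing pieces or pairs$a at the console shows what each step produces before the results are combined.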
And a dplyr option:
library(dplyr)
sample %>% group_by(group) %>% do(data.frame(t(combn(.$number, 2))))
#Source: local data frame [9 x 3]
#Groups: group [2]
#
# group X1 X2
# (fctr) (dbl) (dbl)
#1 a 1 2
#2 a 1 3
#3 a 1 2
#4 a 2 3
#5 a 2 2
#6 a 3 2
#7 b 4 5
#8 b 4 3
#9 b 5 3
Upvotes: 4
Reputation: 886938
We can do the grouping with data.table, using by = group:
library(data.table)
setDT(sample)[, {i1 <- combn(number, 2)
list(i1[1,], i1[2,]) }, by = group]
# group V1 V2
#1: a 1 2
#2: a 1 3
#3: a 1 2
#4: a 2 3
#5: a 2 2
#6: a 3 2
#7: b 4 5
#8: b 4 3
#9: b 5 3
Or a compact option would be
setDT(sample)[, transpose(combn(number, 2, FUN = list)), by = group]
Or using base R
lst <- by(sample$number, sample$group, FUN = combn, m = 2)
data.frame(group = rep(unique(as.character(sample$group)), sapply(lst, ncol)),
           t(do.call(cbind, lst)))
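One caveat that applies to any of these per-group combn calls: when a group has only one row, combn(x, 2) receives a single number, and if that number is a positive integer it is treated as seq_len(x) rather than as a one-element vector. The sample data in the question has no such group, so the sketch below uses made-up data (the names df, pieces and result are hypothetical) and simply drops groups with fewer than two rows before computing the pairs. Whether dropping them is the right choice depends on the use case.

```r
# Hypothetical data: group "c" has a single row.
df <- data.frame(
  group  = c("a", "a", "a", "c"),
  number = c(1, 2, 3, 5)
)

# With a single value 5, combn() enumerates pairs from 1:5 (10 of them),
# not pairs from the one-element vector c(5).
ncol(combn(5, 2))

# Keep only the groups that have at least two rows.
pieces <- Filter(function(x) nrow(x) >= 2, split(df, df$group))

# Compute the within-group pairs as before.
result <- do.call(rbind, lapply(pieces, function(x) {
  data.frame(group = x$group[1], t(combn(x$number, 2)))
}))
```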
Upvotes: 4