Michiel

Reputation: 189

All value combinations for a fixed set of parameter combinations

The problem

I am looking for a way to create a list of all possible combinations of a set of parameters and their values, where each parameter occurs exactly once in each combination. An example input set would look like this:

sampleData = data.frame(Parameter= c("A","B","B","C","C","C","D","D"),
                           Value = c(1,0.9,1,0.8,1,1.2,0.8,1.1))
  Parameter Value
1         A   1.0
2         B   0.9
3         B   1.0
4         C   0.8
5         C   1.0
6         C   1.2
7         D   0.8
8         D   1.1

The desired output is a list of all unique ABCD combinations, so the first two elements of the list would be, e.g.:

[[1]]
      Parameter Value
1         A   1.0
2         B   0.9
3         C   0.8
4         D   0.8
[[2]]
      Parameter Value
1         A   1.0
2         B   1.0
3         C   0.8
4         D   0.8

My attempt so far

I have looked into the combinations function in the gtools package, and the following does something close to what I want:

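library(gtools)

# choose r of the n "Parameter_Value" labels, ignoring which parameter each belongs to
combinations(n = nrow(sampleData),
             r = length(unique(sampleData$Parameter)),
             v = paste0(sampleData$Parameter,"_",sampleData$Value))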

With some post-processing I can get the desired result out of this.

But combinations also yields results like

      Parameter Value
1         A   1.0
2         B   0.9
3         B   1.0
4         D   0.8

i.e. with one (or more) parameters occurring multiple times.

I can filter these out afterwards, but the output of combinations quickly grows large (already 70 rows for this example, and growing combinatorially with n and r), while the desired output list grows much more slowly (12 elements in this example).
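To make the size difference concrete, here is a rough sketch of the generate-then-filter approach and the two counts involved. It uses base R's combn on row indices as a stand-in for gtools::combinations (an assumption made only to keep the example dependency-free), and the variable names are purely illustrative.

```r
sampleData <- data.frame(
  Parameter = c("A", "B", "B", "C", "C", "C", "D", "D"),
  Value     = c(1, 0.9, 1, 0.8, 1, 1.2, 0.8, 1.1)
)

n <- nrow(sampleData)                       # 8 parameter/value rows
r <- length(unique(sampleData$Parameter))   # 4 distinct parameters

# Every way of picking r rows out of n (one combination per column)
idx   <- combn(n, r)
n_all <- ncol(idx)                          # equals choose(n, r)

# Keep only the picks in which no parameter is repeated
is_valid <- apply(idx, 2, function(i) !anyDuplicated(sampleData$Parameter[i]))
n_valid  <- sum(is_valid)

# The valid count is the product of the number of values per parameter
n_expected <- prod(table(sampleData$Parameter))   # 1 * 2 * 3 * 2

c(all = n_all, valid = n_valid, expected = n_expected)
```

So only 12 of the 70 generated combinations survive the filter, which is the overhead I would like to avoid.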

So my question is: Is there a (relatively) efficient way to generate the desired output of ABCD value combinations without first generating a much larger set and then stripping invalid combinations out?

Upvotes: 1

Views: 766

Answers (1)

Z.Lin

Reputation: 29125

I think you're looking for expand.grid().

The following will return a data frame with each unique combination in a row:

library(dplyr)

df <- sampleData %>%
  split(.$Parameter) %>%              # create a dataframe of values for each parameter
  lapply(function(df){df$Value}) %>%  # extract the values for each parameter as a vector
  expand.grid()                       # generate all combinations

> df
   A   B   C   D
1  1 0.9 0.8 0.8
2  1 1.0 0.8 0.8
3  1 0.9 1.0 0.8
4  1 1.0 1.0 0.8
5  1 0.9 1.2 0.8
6  1 1.0 1.2 0.8
7  1 0.9 0.8 1.1
8  1 1.0 0.8 1.1
9  1 0.9 1.0 1.1
10 1 1.0 1.0 1.1
11 1 0.9 1.2 1.1
12 1 1.0 1.2 1.1

And if you want the results converted into a list of data frames:

library(tidyr)

df %>%
  mutate(combination = row_number()) %>%
  gather(Parameter, Value, -combination) %>%
  split(.$combination) %>%
  lapply(function(d){d[,-1]})

$`1`
   Parameter Value
1          A   1.0
13         B   0.9
25         C   0.8
37         D   0.8

$`2`
   Parameter Value
2          A   1.0
14         B   1.0
26         C   0.8
38         D   0.8
...
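If you would rather avoid the dplyr/tidyr dependency, the same idea can probably be written in base R alone. The following is a minimal sketch, not a drop-in replacement: the names values_by_param, grid and combos are made up for illustration, and the resulting data frames have plain 1..4 row names rather than the ones shown above.

```r
sampleData <- data.frame(
  Parameter = c("A", "B", "B", "C", "C", "C", "D", "D"),
  Value     = c(1, 0.9, 1, 0.8, 1, 1.2, 0.8, 1.1)
)

# One numeric vector of candidate values per parameter
values_by_param <- split(sampleData$Value, sampleData$Parameter)

# One row per combination, one column per parameter
grid <- expand.grid(values_by_param)

# Turn each row of the grid into its own Parameter/Value data frame
combos <- lapply(seq_len(nrow(grid)), function(i) {
  data.frame(Parameter = names(grid),
             Value     = unlist(grid[i, ], use.names = FALSE))
})

length(combos)   # number of combinations
combos[[1]]      # first combination
```

Because expand.grid() only ever pairs one value from each vector, no invalid combinations are generated and nothing has to be filtered out afterwards.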

Upvotes: 4
