sebastian-c

Reputation: 15395

Given set of column values, create data.frame with known number of rows

I'm trying to build test datasets with a fixed number of rows. However, the destination I'm writing to requires known keys for each column. For my example, assume that these keys are lowercase letters, uppercase letters and numbers respectively.

I need a function which, given only the required number of rows, combines keys so that the number of combinations equals the required number. Naturally there will be some impossible cases, such as prime numbers larger than the largest set of keys and values larger than the product of the numbers of keys.
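
To make the impossible cases concrete, here is a rough sketch. It assumes key sets of size 26, 26 and 10 (lowercase letters, uppercase letters and ten numbers; the 10 is only an assumption for illustration), and the names `sizes`, `counts`, `reachable` and `impossible` are made up here. It lists the row counts up to 40 that cannot be written as a product of one key count per column:

```r
# Assumed sizes of the three key sets: letters, LETTERS, ten numbers
sizes <- c(26, 26, 10)

# Every way of choosing how many keys to use from each column
counts <- expand.grid(lapply(sizes, seq_len))

# Row counts that some choice of key counts can produce
reachable <- sort(unique(Reduce(`*`, counts)))

# Row counts up to 40 that no choice can produce
impossible <- setdiff(1:40, reachable)
print(impossible)
# [1] 29 31 37
```

Under these assumptions the largest reachable value is 26 * 26 * 10 = 6760.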

A sample output dataset of 10 rows could look like the following:

data.frame(col1 = rep("a", 10),
           col2 = rep(LETTERS[1:5], 2),
           col3 = rep(1:2, 5))

   col1 col2 col3
1     a    A    1
2     a    B    2
3     a    C    1
4     a    D    2
5     a    E    1
6     a    A    2
7     a    B    1
8     a    C    2
9     a    D    1
10    a    E    2

Note here that I had to manually specify the keys to get the desired number of rows. How can I arrange things so that R can do this for me?

Things I've already considered

Upvotes: 0

Views: 115

Answers (2)

apitsch

Reputation: 1702

For small-scale integer optimisation you can use a grid search. Other possibilities are described here.

This should work for your example.

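library(NMOF)

N <- 10  # required number of rows

# Squared distance between the product of the three key counts and N;
# it is zero exactly when x1 * x2 * x3 == N
fr <- function(x) {
  (x[1] * x[2] * x[3] - N)^2
}

# Try every combination of 1 to 5 keys per column (0 keys is not meaningful)
gridSearch(fr, list(seq(1, 5), seq(1, 5), seq(1, 5)))$minlevels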
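
Once you have the three key counts, you still need to turn them into rows. Below is a minimal sketch of one way to do both steps in base R, without NMOF. The function name `make_rows` and the default key sets (in particular `1:10` for the numbers) are assumptions for illustration, not something taken from your setup. It scans all count triples for one whose product is N, then crosses the first few keys of each column with `expand.grid`:

```r
# Hypothetical helper: find key counts whose product is N, then build
# the data frame from the first n[i] keys of each column.
make_rows <- function(N, keys = list(col1 = letters,
                                     col2 = LETTERS,
                                     col3 = 1:10)) {
  sizes <- lengths(keys)

  # All candidate count combinations, one column per key set
  grid <- expand.grid(lapply(sizes, seq_len))

  # Rows of the grid whose counts multiply to exactly N
  hit <- which(Reduce(`*`, grid) == N)
  if (length(hit) == 0) {
    stop("no combination of key counts gives ", N, " rows")
  }
  n <- unlist(grid[hit[1], ])

  # Keep the first n[i] keys of each column and cross them
  expand.grid(Map(function(k, m) k[seq_len(m)], keys, n),
              stringsAsFactors = FALSE)
}

res <- make_rows(10)
nrow(res)
```

Because the rows come from `expand.grid`, every combination is distinct. The function simply takes the first matching triple, so if you prefer a particular spread of keys across columns you would need to pick among the entries of `hit` yourself.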

Upvotes: 1

amonk

Reputation: 1795

I am a bit reluctant, but we can work things out:

  a1 <- 2  # number of keys used in col3
  a2 <- 5  # number of keys used in col2

  # Same data frame as before, built directly instead of via eval(parse())
  data.frame(col1 = rep(LETTERS[1], a1 * a2),
             col2 = rep(LETTERS[1:a2], a1),
             col3 = rep(1:a1, a2))

   col1 col2 col3
1     A    A    1
2     A    B    2
3     A    C    1
4     A    D    2
5     A    E    1
6     A    A    2
7     A    B    1
8     A    C    2
9     A    D    1
10    A    E    2
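
One caveat, as far as I can tell: cycling the keys with rep() only gives distinct rows when a1 and a2 share no common factor (2 and 5 happen to work). A small sketch with a1 = 2 and a2 = 4 shows the difference, using expand.grid as a possible alternative; `cycled` and `crossed` are just illustrative names:

```r
a1 <- 2
a2 <- 4

# Cycling both columns in step: A1, B2, C1, D2, then A1 again
cycled <- data.frame(col2 = rep(LETTERS[1:a2], a1),
                     col3 = rep(1:a1, a2))

# Crossing the keys instead: every (col2, col3) pair appears once
crossed <- expand.grid(col2 = LETTERS[1:a2],
                       col3 = 1:a1,
                       stringsAsFactors = FALSE)

anyDuplicated(cycled)   # non-zero: some rows repeat
anyDuplicated(crossed)  # 0: all rows are distinct
```

Both data frames have a1 * a2 = 8 rows, but only the crossed one contains 8 different combinations.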

Is this something similar to what you are asking?

Upvotes: 0
