DeltaIV

Reputation: 5646

How can I create an arbitrary number of columns of random vectors?

I often need to write something like

sample_size = 10^4
my_data <- data.frame(x1 = runif(sample_size, 0,3), x2 = runif(sample_size, 0,3), x3 = runif(sample_size, 0,3), x4 = runif(sample_size, 0,3))

in order to test some statistical models. For example,

error <- rnorm(sample_size, 0, 0.1)
y <- with( my_data, 2*x1+0.1*(x2 + x3 + x4)) + error
my_model <- lm(y ~ ., data = my_data)

Since my_data is used as input to lm, it has to be a data frame (or a list).

I wonder if invoking runif 4 times is the right way to do this, or if there are better solutions. I tried

my_data <- matrix(runif(4*sample_size, 0,3), sample_size, 4, dimnames = list(NULL, paste0("x", 1:4)))
my_data <- as.data.frame(my_data)

But it doesn't seem very readable to me.

Upvotes: 0

Views: 69

Answers (1)

Gregor Thomas

Reputation: 145755

There are a few ways to do this. Let's say you want ncol columns; here are some good ways:

ncol = 4
sample_size = 10

replicate(ncol, runif(sample_size, 0, 3))
matrix(runif(sample_size * ncol, 0, 3), ncol = ncol)
sapply(1:ncol, function(x) runif(sample_size, 0, 3))

These create matrices, which you can, of course, convert to data frames as needed. The differences are minor. replicate is essentially a convenience wrapper around sapply. The direct matrix method may be slightly faster, but the difference is probably only a few milliseconds.
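As a rough sketch of how this could plug into the model from the question: the column names x1..x4 and the formula come from the question, while n_cols (used instead of ncol to avoid shadowing the base function) and the seed are my own choices for illustration.

```r
set.seed(1)  # arbitrary seed, only so the example is reproducible

n_cols <- 4          # hypothetical name; avoids masking base::ncol
sample_size <- 10^4

# Build a sample_size x n_cols matrix of uniform draws, then convert it
my_data <- as.data.frame(replicate(n_cols, runif(sample_size, 0, 3)))

# as.data.frame() names the columns V1, V2, ...; rename to x1, x2, ...
names(my_data) <- paste0("x", seq_len(n_cols))

# Same test model as in the question
error <- rnorm(sample_size, 0, 0.1)
y <- with(my_data, 2 * x1 + 0.1 * (x2 + x3 + x4)) + error
my_model <- lm(y ~ ., data = my_data)
```

Changing n_cols is then the only edit needed to get more predictors, although the formula for y would have to be adapted to match.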

Upvotes: 1
