Emm

Reputation: 113

How to generate uniform random numbers for each value in a data frame column without repeating runif for every value?

I need to generate random variables from a uniform distribution using the runif command. I have a column of values that the min and max will be based on. Is there a way to apply runif down the column without repeating the call for each value in the column? No other functions can be used: we're working with rnorm, runif, etc., so no other stats functions are allowed.

For example:

set.seed(1234)
values <- c(30, 45, 80, 90, 80)
var_1 <- runif(5, 30*(.5), 30*(1.25))
var_2 <- runif(5, 45*(.5), 45*(1.25))
var_3 <- runif(5, 80*(.5), 80*(1.25))
var_4 <- runif(5, 90*(.5), 90*(1.25))
var_5 <- runif(5, 80*(.5), 80*(1.25))

This is basically what I'd have to do, except that the real data frame has many more than five observations, and I have to generate many more than 5 random numbers per value. I was hoping there was a way to expedite the process so I don't need to repeat the var_3 <- runif part for every row in my data frame. If it helps, I can turn the column of the data frame into a matrix with a single column and multiple rows. Eventually, I'll be sampling from these randomly generated numbers to perform a Monte Carlo simulation.

I'm assuming some apply function would work, but I'm still not certain how anything from the apply family works. I have looked at some already posted answers, but they were a bit over my head and I couldn't adapt them to my problem as I had initially thought I could.

Upvotes: 1

Views: 1700

Answers (2)

chinsoon12

Reputation: 25225

Here is another option using the inverse probability integral transform (PIT):

set.seed(1234)
values <- c(30, 45, 80, 90, 80)
n <- length(values)
m <- 10L
t(values * t((1.25 - 0.5) * matrix(runif(m*n), m, n) + 0.5))

Each of OP's draws is a value multiplied by a Uniform(0.5, 1.25) variable. The CDF of that multiplier is F(x) = (x - 0.5) / (1.25 - 0.5) for 0.5 <= x <= 1.25. Hence, F^{-1}(u) = (1.25 - 0.5) * u + 0.5.

We generate standard uniform random variables and transform them into the desired distribution via the inverse PIT, using this F^{-1}(u).

The two t operations are i) for easy scaling by values (after the inner transpose, values recycles along the rows) and ii) to return the output to an m-by-n layout with one column per value.
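As a rough sanity check of the transform above, the sketch below rebuilds the same matrix in two steps and confirms that every column stays inside its own bounds. The names mult, out and out2 are introduced here for illustration only, and the sweep() call is offered as an equivalent way to scale columns, not as part of the original answer.

```r
set.seed(1234)
values <- c(30, 45, 80, 90, 80)
n <- length(values)
m <- 10L

# Standard uniforms rescaled to (0.5, 1.25) via F^{-1}(u)
mult <- (1.25 - 0.5) * matrix(runif(m * n), m, n) + 0.5

# Double-transpose scaling, as in the one-liner above
out <- t(values * t(mult))

# Same scaling without transposing: multiply column j by values[j]
out2 <- sweep(mult, 2, values, `*`)

# Bounds laid out column by column to match the matrix storage order
lower <- rep(values * 0.5,  each = m)
upper <- rep(values * 1.25, each = m)

stopifnot(
  isTRUE(all.equal(out, out2)),
  all(out >= lower),
  all(out <= upper)
)
```

If the stopifnot call returns silently, both forms agree and each column j lies between 0.5 * values[j] and 1.25 * values[j].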

Reference:

  1. Inverse transform sampling, Wikipedia, https://en.wikipedia.org/wiki/Inverse_transform_sampling

Upvotes: 0

Ahorn

Reputation: 3876

Like this:

set.seed(1234)
values <- c(30, 45, 80, 90, 80)

mat <- sapply(values, function(x) runif(5, x*(.5), x*(1.25)))

colnames(mat) <- values

mat

> mat
           30       45       80        90       80
[1,] 15.79778 33.49176 82.79809 106.63342 84.65663
[2,] 27.71421 27.73334 46.04614 108.84509 94.95845
[3,] 21.30580 26.88622 97.01830  63.84305 99.67589
[4,] 19.59442 37.19917 47.30907  53.33430 96.54164
[5,] 18.00913 23.80419 53.17940  98.80833 69.16812

Upvotes: 2
