wolfsatthedoor
wolfsatthedoor

Reputation: 7303

R Using distributions to generate random variables, but with many different dimensions and parameter values

Ok, I apologize if this question is asked elsehwere. I am not getting this though.

I want to generate S Dirichlet draws (from 'MCMCpack') for EACH set of parameter values I have. For example,

S=10 
param1=c(1,1,1)
param2=c(1,1,1000)

rdirchlet(2,c(param1,param2))

I want the last line to return something like

1/3      1/3     1/3 
1/1002 1/1002 1000/1002

(It will in expectation.)

Obviously the last line does not work, since it interprets it as somehow a 6 parameter thing. Can someone help me figure this out once and for all and I will delete this question if it is a stupid one?

Upvotes: 1

Views: 1035

Answers (1)

jlhoward
jlhoward

Reputation: 59355

I think you may be misunderstanding what rdirichlet(...) does (BTW: you do have to spell it correctly...).

rdirichlet(n,alpha) returns a matrix with n rows, and length(alpha) columns. Each row corresponds to a random deviate taken from the gamma distribution with scale parameter given by the corresponding element of alpha, normalized so that the row-wise sums are 1. So, for example,

set.seed(1)
rdirichlet(2,c(1,1,1))
#            [,1]      [,2]      [,3]
# [1,] 0.04037978 0.4899465 0.4696737
# [2,] 0.25991848 0.3800170 0.3600646

Two rows because n=2, 3 columns because length(alpha)=3. There is no reason to expect that the values in the three columns will be equal (to 1/3) just because alpha = c(1,1,1), although the column-wise means will approach (1/3,1/3,1/3) for large n:

set.seed(1)
colMeans(rdirichlet(1000,c(1,1,1)))
# [1] 0.3371990 0.3314027 0.3313983

Given this, it is not clear (to me at least) what you want exactly. This will create a list of matrices:

set.seed(1)
lapply(list(param1,param2),function(x)rdirichlet(2,x))
# [[1]]
#            [,1]      [,2]      [,3]
# [1,] 0.04037978 0.4899465 0.4696737
# [2,] 0.25991848 0.3800170 0.3600646

# [[2]]
#              [,1]         [,2]      [,3]
# [1,] 0.0010146803 0.0003150297 0.9986703
# [2,] 0.0001574301 0.0003112573 0.9995313

Something that looks more or less like your expected output can be generated this way:

set.seed(1)
t(apply(rbind(param1,param2),1,function(x)colMeans(rdirichlet(S,x))))
#                [,1]        [,2]      [,3]
# param1 0.3765401986 0.369370923 0.2540889
# param2 0.0005991643 0.001380334 0.9980205

Finally, the univariate distributions work differently. rnorm(...), runif(...) etc return a vector (not a matrix), so the apply(...) functions can be used more or less directly:

param1 <- c(0,1)
param2 <- c(5,2)
param3 <- c(1,.2)
set.seed(1)
sapply(list(param1,param2,param3),function(x)rnorm(5,mean=x[1],sd=x[2]))
#           [,1]     [,2]      [,3]
# [1,] -0.6264538 3.359063 1.3023562
# [2,]  0.1836433 5.974858 1.0779686
# [3,] -0.8356286 6.476649 0.8757519
# [4,]  1.5952808 6.151563 0.5570600
# [5,]  0.3295078 4.389223 1.2249862

Here, each column is a vector of random variates from the corresponding parameter-set.

Upvotes: 2

Related Questions