Mark Miller
Mark Miller

Reputation: 13123

simulate probability with gamma distribution in SAS versus R

I have some SAS code I am trying to translate into R. At one point the SAS code simulated a probability bhrsim essentially with the following. I am unsure how to translate this into R but show my attempt below. I do not think I have ever tried to simulate probabilities in this fashion before. First the SAS code:

data aa;

seed1    = 11111111 ;
seed2    = 33333333 ;
bhralpha = 2 ;
bhrbeta  = 2 ;

%let rep=50;

do i=1 to &rep;

call rangam(seed1,bhralpha,xgam1);
call rangam(seed2,bhrbeta,xgam2);

bhrsim=xgam1/(xgam1+xgam2);

output;
end;

proc print;
    var seed1 seed2 bhralpha bhrbeta xgam1 xgam2 bhrsim ;
run;

I am a little surprised the call rangam function does not use both the alpha and the beta parmeters at the same time.

Here is the output from the above SAS code formatted in R:

SAS.bhrsim <- c(0.43771, 0.28190, 0.47057, 0.30099, 0.25343,
                0.67331, 0.66465, 0.85924, 0.80344, 0.07354,
                0.16228, 0.92554, 0.52711, 0.56944, 0.10370,
                0.05858, 0.89869, 0.88515, 0.74498, 0.17853,
                0.71263, 0.49174, 0.21546, 0.42798, 0.26264,
                0.52674, 0.41922, 0.71119, 0.92044, 0.69456,
                0.33825, 0.55214, 0.42025, 0.93093, 0.20075,
                0.50655, 0.53586, 0.41479, 0.91006, 0.61604,
                0.71669, 0.72271, 0.91053, 0.73377, 0.50403,
                0.28722, 0.54455, 0.26749, 0.58494, 0.17943)

mean(SAS.bhrsim)
#[1] 0.5226472

Here is my attempt to translate this into R. I am hoping someone can check that my R code is correct and point out any mistakes. In particular I was not expecting to have to use this line: bhrsim <- xgam1 / (xgam1 + xgam2) in R, but that just may be because I have never done this before.

set.seed(1234)

rep <- 50

bhralpha <- 2
bhrbeta  <- 2

xgam1 <- rgamma(rep, shape = bhralpha, scale = bhrbeta)
xgam2 <- rgamma(rep, shape = bhralpha, scale = bhrbeta)

bhrsim <- xgam1 / (xgam1 + xgam2)

mean(xgam1)
#[1] 3.698261

mean(xgam2)
#[1] 4.427575

mean(bhrsim)
#[1] 0.4477643

R.bhrsim <- c(0.34655843, 0.71945276, 0.20861699, 0.30611308, 0.55605485,
              0.55064680, 0.69470936, 0.45576462, 0.30185399, 0.10438540,
              0.48265790, 0.75655704, 0.42791822, 0.26072882, 0.45523966,
              0.45975115, 0.21874035, 0.22447877, 0.34634855, 0.38155777,
              0.11296304, 0.48800715, 0.37407339, 0.50943619, 0.71306467,
              0.85602900, 0.88723711, 0.05755638, 0.63186913, 0.13553985,
              0.67087093, 0.26015097, 0.26414524, 0.14128396, 0.71854283,
              0.77900243, 0.53769594, 0.46026569, 0.09196446, 0.67574945,
              0.24611917, 0.54923784, 0.60613130, 0.67733881, 0.22317935,
              0.64392641, 0.43528504, 0.44700128, 0.55522003, 0.38119564)

Upvotes: 2

Views: 138

Answers (1)

Ben Bolker
Ben Bolker

Reputation: 226961

According to SAS documentation for call rangam(), call rangam(seed, a, x) generates a Gamma random deviate with shape equal to a, using random seed seed, and assigns the result to x (and updates the seed internally — calling call rangam(...) a second time with the same seed argument uses a new value from the RNG stream (the idiom seems very strange to me but what do I know?). So the R equivalent is set.seed(seed) (once, at the beginning of the loop); x <- rgamma(1, a) (R allows you to set the rate or scale != 1 by specifying additional arguments; the first argument is the number of deviates requested.)

So I believe the R equivalent would be

set.seed(111111)
xgam1 <- rgamma(50, shape = 2)
xgam2 <- rgamma(50, shape = 2)
bhrsim <- xgam1 / (xgam1 + xgam2)
mean(bhrsim)
## [1] 0.5579772

For what it's worth, this is a standard algorithm for generating Beta(alpha, beta) - distributed random numbers (although not, apparently, identical to whatever R uses internally ...)

set.seed(111111)
bhrsim <- rbeta(50, shape1 = 2, shape2 = 2)
mean(bhrsim)
## [1] 0.5324306

Upvotes: 2

Related Questions