lee
lee

Reputation: 23

How to generate target number of samples from a distribution under a rejection criterion

I'm trying rnbinom as below

x<- rnbinom(500, mu = 4, size = .1)
xtrunc <- x[x>0]

then I just get 125 observations.

However, I want to make 500 observations excluding 0 (zero) with same condition (mu = 4, size =.1).

Upvotes: 1

Views: 236

Answers (1)

Zheyuan Li
Zheyuan Li

Reputation: 73265

This does the job:

N <- 500    ## target number of samples

## set seed for reproducibility
set.seed(0)
## first try
x <- rnbinom(N, mu = 4, size = .1)
p_accept <- mean(success <- x > 0)  ## estimated probability of accepting samples
xtrunc <- x[success]
## estimated further sampling times
n_further <- (N - length(xtrunc)) / p_accept
## further sampling
alpha <- 1.5   ## inflation factor
x_further <- rnbinom(round(alpha * n_further), mu = 4, size = .1)
## filter and combine
xtrunc <- c(xtrunc, (x_further[x_further > 0])[seq_len(N - length(xtrunc))])

## checking
length(xtrunc)
# [1] 500

summary(xtrunc)
#   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
#   1.00    2.00    5.00   12.99   16.00  131.00 

In above, sampling takes two stages. The result of the initial stage is used to estimate probability of acceptance rate to guide the second stage sampling.

However, since the underlying distribution is explicitly known, the theoretical probability of acceptance rate is know. Therefore, there is no need to perform a two-stage approach in this case. Try:

p <- 1 - pnbinom(0, mu = 4, size = .1)  ## theoretical probability
alpha <- 1.5
n_try <- round(alpha * N / p)
set.seed(0)
x <- rnbinom(n_try, mu = 4, size = .1)
xtrunc <- (x[x > 0])[1:N]

## checking
length(xtrunc)
# [1] 500

summary(xtrunc)
#   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
#   1.00    2.00    5.00   12.99   16.00  131.00

The idea behind is the theory of geometric distribution. My answer here is closely related. Read the "More efficient vectorized method" section for detailed explanation.

Upvotes: 1

Related Questions