Pulkit Srivastava
Pulkit Srivastava

Reputation: 11

how to generate data from a distribution whose cdf is not in closed form?

I am working on a distribution whose PDF and CDF is

enter image description here

clearly, CDF is not in neat form. So how do I generate data from this CDF in R as the inverse CDF method can't be applied here? I am using the following R program to generate a sample from this dist but I am not able to justify every step. Here is my code:

alpha=1.5; beta=3.2;

pdf=function(x)((beta/alpha)*((x/alpha)^beta)*sin(pi/beta))/(((1+(x/alpha)^beta)^2)*(pi/beta))

rand_smplfunc <- function(func, lower, upper){
  x_values <- seq(lower,upper,by = 10^-3)
  sample(x = x_values, size = 1, replace = TRUE,prob = func(x_values))
}

replicate(10,rand_smplfunc(pdf,0,10))

Upvotes: 1

Views: 696

Answers (2)

Sandipan Dey
Sandipan Dey

Reputation: 23101

You can do it the following way, using the inverse cdf method with the empirical cdf (don't need the analytically derived cdf):

alpha = 1.5
beta = 3.2

pdf = function(x)((beta/alpha)*((x/alpha)^beta)*sin(pi/beta))/(((1+(x/alpha)^beta)^2)*(pi/beta))
cdf = function(x) cumsum(pdf(x)) / sum(pdf(x)) # approximate the cdf

x <- seq(0.1, 50, 0.1)
y <- pdf(x)

plot(x, pdf(x), type='l', main='pdf', ylab='f(x)') # plot the pdf

enter image description here

plot(x, cdf(x), type='l', main='cdf', ylab='F(x)') # plot the cdf

enter image description here

# draw n samples by using probability integral transform from unifrom random 
# samples by computing the inverse cdf approximately
n <- 5000
random_samples <- approx(x = cdf(x), y = x,  runif(n))$y

# plot histogram of the samples
hist(random_samples, 200)

enter image description here From the above plot, you can see that the distribution of the generated samples indeed resembles the original pdf.

Upvotes: 1

SteveM
SteveM

Reputation: 2301

You could use the empirical cdf function ecdf as an approximation. E.g.

set.seed(7)
pdf <- rnorm(10000)
cdf <- ecdf(pdf)
summary(cdf)
Empirical CDF:    10000 unique values with summary
Min.   1st Qu.    Median      Mean   3rd Qu.      Max. 
-3.930902 -0.671235 -0.001501  0.001928  0.669820  3.791824
range <- -3:3
cdf(range)
[1] 0.0013 0.0219 0.1601 0.5007 0.8420 0.9748 0.9988

So similarly generate a sufficient number of realizations of your pdf and then use the ecdf function to approximate your cdf.

Note that you can also plot your empirical cdf:plot(cdf)

Upvotes: 0

Related Questions