johnmajimboni
johnmajimboni

Reputation: 3

Simulate left Truncated Gamma distribution with mean = 1 and variance =0.4

Suppose X~Γ(α, β), I would like to truncate all values of X

MATLAB codes:

t = 0.5;  theta = 0.4; 
syms alpha beta 
EX = beta*( igamma(alpha+1,t/beta) /  igamma(alpha,t/beta) ); %Mean  
EX2 = beta^2*( igamma(alpha+2,t/beta) /  igamma(alpha,t/beta) );%Second moment         
VarX = EX2 -EX^2; %Variance       
cond1 = alpha > 0; cond2 = beta > 0; cond3 = EX==1; cond4 = VarX ==theta;
conds =[cond1 cond2 cond3 cond4]; vars = [alpha, beta];
sol = solve(conds, [alpha beta], 'ReturnConditions',true);
soln_alpha = vpa(sol.alpha)
soln_beta = vpa(sol.beta)

The above code returns a numeric answer only if the constraint that α>0 is relaxed. The numeric answer has a negative value of α which is wrong since both α (shape parameter) and β (scale parameter) must be strictly positive.

Upvotes: 0

Views: 316

Answers (1)

SecretAgentMan
SecretAgentMan

Reputation: 2854

Based on your title, I take it you want to generate samples from a Gamma distribution with mean = 1 and variance = 0.4 but want the distribution truncated to [0, inf].

If X ~ Gamma(alpha,beta), then by definition it must be nonnegative (see Gamma Distribution wiki, or MATLAB page). Indeed, both shape and scale parameters are also nonnegative. Note: MATLAB uses the (k,theta) parameterization found on the wiki page.

MATLAB has implemented probability distribution objects which make a lot of things very convenient from a practitioner perspective (or anyone who uses numerical approaches).

alpha = 0.4;
beta = 0.5;
pd = makedist('Gamma',alpha,beta)   % Define the distribution object    

Generating samples is now very easy.

n = 1000;                           % Number of samples
X = random(pd,n,1);                 % Random samples of X ~ Gamma(alpha,beta) 

All that is left is to identify the shape and scale parameters that such that E[X] = 1 and Var(X) = 0.4.

You need to solve

alpha * beta = E[X],
alpha * (beta^2) = Var(X),

for alpha and beta. It is a system of two nonlinear equations with two unknowns.

However, truncation makes these obsolete but numerical approaches will work fine.

LB = 0.5;                           % lower bound    (X > LB)
UB = inf;                           % upper bound    (X < UB)
pdt = truncate(pd,LB,UB)            % Define truncated distribution object
Xt = random(pd,n,1); 

pdt =
GammaDistribution

Gamma distribution
a = 0.4
b = 0.5
Truncated to the interval [0.5, Inf]

Fortunately, the mean and variance of a distribution object are accessible whether or not it is truncated.

mean(pdt)     % compare to mean(pd)
var(pdt)      % compare to var(pd)

You can numerically solve this problem to obtain your parameters with something like fmincon.

tgtmean = 1;
tgtvar = 0.4;
fh_mean =@(p) mean(truncate(makedist('Gamma',p(1),p(2)),LB,UB));
fh_var =@(p) var(truncate(makedist('Gamma',p(1),p(2)),LB,UB));
fh =@(p) (fh_mean(p)-tgtmean).^2 + (fh_var(p)-tgtvar).^2;
[p, fval] = fmincon(fh,[alpha;beta],[],[],[],[],0,inf)

You can test the answer for validation:

pd_test = truncate(makedist('Gamma',p(1),p(2)),LB,UB);
mean(pd_test)
var(pd_test)

ans = 1.0377
ans = 0.3758

Note this seems ill-conditioned due to the desired truncation and target mean. This might be good enough depending on your application.

histogram(random(pd_test,n,1))   % Visually inspect distribution

Histogram of Gamma dist truncated to [0.5, inf]

Mean and variance combinations must be feasible under the base distribution (here, Gamma distribution), but if truncating, that further restricts the set of feasible parameters. For example, it would be impossible to truncate X~Gamma() to the interval [5, 500] and seek to get a mean of 2 or a mean of 600.


MATLAB code verified with version R2017a.

Also note that solutions from nonlinear solvers like fmincon may be sensitive to the initial starting point for some problems. If that numerical approach is giving issues, it may be a feasibility issue (as alluded to above) or it may require using multiple start points and multiple fmincon calls to get multiple answers, then use the best one.

Upvotes: 1

Related Questions