JES0

Reputation: 43

How to generate scenarios from a predicted CDF in MATLAB or Python?

I use MATLAB, but Python solutions are also welcome.

I have a predicted CDF (denoted CDF^) of a random variable Var and would like to generate N scenarios from it. Here is what I have done so far. I would like to know whether this method makes sense, and how I can automatically generate the N scenarios in step 3.

1) I fitted an assumed distribution (say, Weibull) to CDF^ by maximum likelihood (MLE) and obtained the parameters of the fitted function.

2) Using these parameters, I plotted the PDF of the assumed distribution.

3) This is where I am stuck. My guess is that I should discretize var and find the probability of each segment by calculating the area of the corresponding rectangle.

4) How can I plot my original data (var) as a PMF, given that it is already in CDF form?

var = [0.001 0.01 97 145 150 189 202 183 248 305 492 607 1013];
cdf_prob = [0.01, 0.05, 0.15, 0.25, 0.35, 0.45, 0.50, 0.55, 0.65, 0.75, 0.85, 0.95, 0.99];   % cumulative prob.
a = mle(var, 'distribution', 'wbl');              % Weibull MLE: a(1) = scale, a(2) = shape
plot(var, cdf_prob, 'o-')                         % my data
hold on
xgrid = linspace(0, 1.1*max(var));
plot(xgrid, wblcdf(xgrid, a(1), a(2)));           % fitted cdf

figure(2)                                         % fitted PDF
pd = makedist('Weibull', 'a', a(1), 'b', a(2));
y = pdf(pd, xgrid);
plot(xgrid, y)

[Figure: Step 3]

Upvotes: 2

Views: 470

Answers (1)

SecretAgentMan

Reputation: 2854

Generating Samples:
You can generate samples from a distribution in many ways. If you already know you are going to use a specific distribution, such as the Weibull distribution, then two easy options are:

  1. Use makedist() and random(),[1] or
  2. Use wblrnd().

Both require the Statistics toolbox. Toolbox-free approaches are also possible. I recommend not naming a variable var, because doing so masks the built-in var() function.

% MATLAB R2019a
a = [209.2863 0.5054];        % a = mle(var, 'distribution', 'wbl');  % from OP code
NumSamples = 500;
pd = makedist('Weibull','a',a(1),'b',a(2));   % makedist takes parameters as name-value pairs

% Method 1
X = random(pd,NumSamples,1);

% Method 2
X2 = wblrnd(a(1),a(2),NumSamples,1);
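
Since the question also welcomes Python, here is a minimal toolbox-free sketch of the same sampling step using inverse transform sampling and only the standard library. It assumes the scale and shape values quoted above; the function names weibull_inverse_cdf and sample_weibull are made up for this illustration and are not part of any library.

```python
import math
import random

def weibull_inverse_cdf(u, scale, shape):
    """Weibull quantile function: the x for which F(x) = u, with 0 <= u < 1."""
    return scale * (-math.log(1.0 - u)) ** (1.0 / shape)

def sample_weibull(n, scale, shape, seed=None):
    """Draw n samples by pushing uniform(0, 1) draws through the quantile function."""
    rng = random.Random(seed)
    return [weibull_inverse_cdf(rng.random(), scale, shape) for _ in range(n)]

# Parameters copied from the MATLAB fit above: a(1) = scale, a(2) = shape.
scale, shape = 209.2863, 0.5054
samples = sample_weibull(500, scale, shape, seed=0)
print(len(samples), min(samples), max(samples))
```

The standard library's random.weibullvariate(alpha, beta), with alpha as the scale and beta as the shape, should give equivalent draws. The explicit version is shown only to make the inverse-CDF step visible.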

Plotting the original data:
If the data are assumed to come from a continuous distribution, such as the Weibull distribution, then one should use a probability density function (PDF) to show relative likelihood visually, rather than a discrete probability mass function (PMF). PMFs only apply to discrete variables. Note that cumulative distribution functions (CDFs) apply to both continuous and discrete random variables.

This can be done with the 'Normalization','pdf' name-value pair in the histogram() properties. To achieve better results, it is often advisable to adjust the number of histogram bins (in the properties), but with only 13 data points, this is of limited value.

h = histogram(var,'Normalization','pdf')
h.NumBins = 13;

You can also overlay the fitted distribution against the empirical data.

figure, hold on
h = histogram(var,'Normalization','pdf','DisplayName','Data');
xLimits = xlim;
Xrng = 0:.01:xLimits(2);
plot(Xrng,pdf(pd,Xrng),'r--','DisplayName','Fit')
xlabel('Var')
ylabel('Probability Density Function (PDF)')
legend('show')

% Adjust these manually
ylim([0 0.02])
h.NumBins = 13;
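
If a discretized view is still wanted, as in step 3 of the question, the probability of each segment can be taken as a difference of CDF values rather than a rectangle area under the PDF. The Python sketch below assumes the fitted Weibull parameters quoted earlier. The bin edges are hypothetical and chosen only for illustration, and weibull_cdf and bin_probabilities are illustrative helper names.

```python
import math
import random

def weibull_cdf(x, scale, shape):
    """Weibull CDF: F(x) = 1 - exp(-(x/scale)^shape) for x > 0, else 0."""
    if x <= 0:
        return 0.0
    return 1.0 - math.exp(-((x / scale) ** shape))

def bin_probabilities(edges, scale, shape):
    """Probability of each bin (lo, hi] as F(hi) - F(lo)."""
    cdf = [weibull_cdf(e, scale, shape) for e in edges]
    return [hi - lo for lo, hi in zip(cdf[:-1], cdf[1:])]

scale, shape = 209.2863, 0.5054              # fitted values quoted in the answer
edges = [0, 100, 200, 400, 800, math.inf]    # hypothetical bin edges (assumption)
probs = bin_probabilities(edges, scale, shape)

# Pick scenario bins in proportion to their probability.
rng = random.Random(0)
scenario_bins = rng.choices(range(len(probs)), weights=probs, k=10)
print(probs, scenario_bins)
```

Because the last edge is infinity, the bin probabilities sum to one up to rounding, so they can be used directly as a PMF over the segments.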

Alternatives:[2]

You can use fitdist() to fit a kernel density. The result is still a probability distribution object, so functions such as random() and pdf() continue to work.

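Note that I've truncated the kernel distribution at zero: the Weibull distribution is supported only on [0, inf), whereas a kernel density estimate can place probability mass on negative values.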

pd2 = fitdist(X,'Kernel')
pd2t = truncate(pd2,0,inf)

Then plotting is still relatively easy and similar to the previous example.

figure, hold on
h = histogram(var,'Normalization','pdf','DisplayName','Data');
xLimits = xlim;
Xrng = 0:.01:xLimits(2);
plot(Xrng,pdf(pd2t,Xrng),'r--','DisplayName','Fit')
xlabel('Var')
ylabel('Probability Density Function (PDF)')
legend('show')
h.NumBins = 13;

A remaining alternative is to use ksdensity() to produce the plot.
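
Another distribution-free option, sketched here in Python, is to sample straight from the predicted CDF points by interpolating its inverse. This is an illustration under several assumptions, not a method from the original answer: the values are sorted so that the quantile function is non-decreasing (the posted data are not strictly ordered), interpolation between points is linear, and draws outside the 0.01 to 0.99 probability range are clamped to the end values. The name empirical_quantile is made up for this example.

```python
import bisect
import random

cdf_prob = [0.01, 0.05, 0.15, 0.25, 0.35, 0.45, 0.50,
            0.55, 0.65, 0.75, 0.85, 0.95, 0.99]
# Sorted copy of the question's data (assumption: quantiles must be non-decreasing).
values = sorted([0.001, 0.01, 97, 145, 150, 189, 202, 183, 248, 305, 492, 607, 1013])

def empirical_quantile(u, probs, vals):
    """Piecewise-linear inverse CDF; u outside [probs[0], probs[-1]] is clamped."""
    if u <= probs[0]:
        return vals[0]
    if u >= probs[-1]:
        return vals[-1]
    i = bisect.bisect_right(probs, u)            # probs[i-1] <= u < probs[i]
    t = (u - probs[i - 1]) / (probs[i] - probs[i - 1])
    return vals[i - 1] + t * (vals[i] - vals[i - 1])

rng = random.Random(0)
scenarios = [empirical_quantile(rng.random(), cdf_prob, values) for _ in range(500)]
print(min(scenarios), max(scenarios))
```

Clamping means the scenarios never fall outside the range of the predicted quantiles. If the tails matter, a parametric fit such as the Weibull above may be the better choice.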


[1] Generating samples from Weibull distribution in MATLAB
[2] Related: https://stackoverflow.com/a/56759220/8239061

Upvotes: 2
