Levent Bekdemir
Levent Bekdemir

Reputation: 67

Convolution of two dependent distributions in MATLAB

Assume that I have two discrete random variables X and Y.

X = {1,3,3,5,7,7,7,9,9,9,9,9}

and

Y = {5,5,9,9,10,12,13}

Where their empirical CDFs are given as:

F_x(1) = 0.0833, F_x(3) = 0.25, F_x(5) = 0.33, F_x(7) = 0.5833 and F_x(9) = 1

and

F_y(5) = 0.2857, F_y(9) = 0.5714, F_y(10) = 0.7143, F_y(12) = 0.8571 and F_y(13) = 1 

Assuming their joint distribution is

H(x,y) = F_x(x) * F_y(y)

which is actually the "assumption" of X and Y are independent.

How can i calculate the Z = X + Y and F(z) in MATLAB ?

Note: I gave the H(x,y) as a simple product function for the simplicity, but it can be anything in reality which actually models the dependency between X and Y.

Upvotes: 1

Views: 654

Answers (1)

Cris Luengo
Cris Luengo

Reputation: 60564

Given continuous probability density functions FX and FY, with a joint probability density function FX,Y, we can compute FX+Y as the integral of FX,Y over the line z=x+y. If the probability density functions are discrete, the integral above should be written as the derivative of the integral over the part of the plane given by z<=x+y.

This is fairly simple to do in MATLAB. Let's start with OP's data:

F_x = [0.0833,0.0833,0.25,0.25,0.33,0.33,0.5833,0.5833,1]; % CDF
F_x = diff([0,F_x]); % PDF
F_y = [0,0,0,0,0.2857,0.2857,0.2857,0.2857,0.5714,0.7143,0.7143,0.8571,1]; % CDF
F_y = diff([0,F_y]); % PDF
H = F_x.' .* F_y; % example joint PDF

Now we sum F_cum(z) = sum(H(x,y)) for all values z<=x+y, and then take the derivative F = diff([0,F_cum]):

[m,n] = size(H);
F_cum = zeros(1,m+n-1);
for z = 1:numel(F_cum)
   s = 0;
   for x = 1:numel(F_x)
      y = z-x+1;
      y = max(min(y,n),1); % avoid out of bounds indexing
      s = s + sum(H(x,1:y));
   end
   F_cum(z) = s;
end
F = diff([0,F_cum]);

Note that we defined y=z-x+1, meaning z=y+x-1. Thus F(1) corresponds to z=2. This is the lowest possible value that can come out of the sum of the two distributions, which we defined to start at 1.

The above can be simplified by padding H with zeros and shifting each row by one additional element. This lines up the line z=x+y on a column of the matrix, allowing us to use a trivial sum projection:

H = [H,zeros(m)];
for ii=2:m
   H(ii,:) = circshift(H(ii,:),ii-1);
end
F_cum = cumsum(sum(H,1));
F_cum = F_cum(1:end-1); % last element we don't need
F2 = diff([0,F_cum]);

But because diff([0,cumsum(F)]) == F (up to numerical precision), we can skip those two operations:

F3 = sum(H,1);
F3 = F3(1:end-1); % last element we don't need

(all(abs(F-F2)<1e-15) and all(abs(F-F3)<1e-16))

Upvotes: 1

Related Questions