Lesson 46 – How convoluted can it be?

After last week’s conversation with Devine about the Gamma distribution, an inspired Joe wanted to derive the probability density function of the Gamma distribution from the exponential distribution using the idea of convolution.

But first, he has to understand convolution. So he called upon Devine for his usual dialog.

J: Hello D, I can wait no longer, nor can I move on to a different topic when this idea of convolution is not clear to me. I feel anxious to know at least the basics that relate to our lesson last week.

D: It is a good anxiety to have. It will keep you focused on the mission. Where do we start?

J: We have been having a form of dialog since the time we met. Why don’t you provide the underlying reasoning, and I will knit the weave from there.

D: Sounds good to me. Let me start by reminding you of our conversation in Lesson 23 about probability distributions. You introduced me to the Chicago dice game, where you throw a pair of dice to score the numbers 2 – 12 in the order of the rounds.

J: Yes, I remember.

D: Let’s assume that Z is that outcome, the sum of the numbers on the two dice, say X and Y.

 Z = X + Y

Create a table of these outcomes and what combinations can give you those outcomes.

J: We did this too during lesson 23. Here is the table.

Z = 2: (1,1)
Z = 3: (1,2), (2,1)
Z = 4: (1,3), (2,2), (3,1)
Z = 5: (1,4), (2,3), (3,2), (4,1)
Z = 6: (1,5), (2,4), (3,3), (4,2), (5,1)
Z = 7: (1,6), (2,5), (3,4), (4,3), (5,2), (6,1)
Z = 8: (2,6), (3,5), (4,4), (5,3), (6,2)
Z = 9: (3,6), (4,5), (5,4), (6,3)
Z = 10: (4,6), (5,5), (6,4)
Z = 11: (5,6), (6,5)
Z = 12: (6,6)

D: Now, take any one outcome for Z, let’s say Z = 3, and find out the probability that the random variable Z takes a value of 3, i.e., how do you compute P(Z = 3)?

J: There are two ways of getting a 3: when X = 1 and Y = 2, or when X = 2 and Y = 1. There are 36 combinations in total, so P(Z = 3) = 2/36.
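A quick brute-force enumeration in Python — a minimal sketch added purely for illustration — confirms this count:

```python
from fractions import Fraction
from itertools import product

# Enumerate all 36 equally likely (X, Y) pairs for two fair dice
outcomes = list(product(range(1, 7), repeat=2))

# Count the pairs whose sum is 3
favorable = sum(1 for x, y in outcomes if x + y == 3)
print(Fraction(favorable, len(outcomes)))  # 1/18, i.e., 2/36
```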

D: Excellent. Now let me walk you through another way of thinking. You said there are two ways of getting a 3.

X = 1 and Y = 2
or
X = 2 and Y = 1

What is the probability of the first combination?

J: P(X = 1 and Y = 2) = P(X = 1).P(Y = 2) since X and Y are independent.

D: What is the probability of the second combination?

J: P(X = 2 and Y = 1) = P(X = 2).P(Y = 1), again since X and Y are independent.

D: What is the probability of Z = 3 based on these combinations?

J: Ah, I see. Since either of these combinations can occur to get an outcome 3, P(Z = 3) is the union of these combinations.

P(Z = 3) = P(X = 1).P(Y = 2) + P(X = 2).P(Y = 1) = 2/36

D: Yes. If you represent these as their probability mass functions, you get

 f(z) = \sum_{\text{all possible combinations}} f(x)f(y)

Let me generalize it to any function of X and Y so that it can help in your derivations later.

We are attempting to determine f(z), the distribution function of Z, i.e., P(Z = z). If X = x, then for the sum Z = X + Y to hold, Y = z - x.

This means we can find f(z) by summing over all possible values of x:

 P(Z = z) = \sum_{x=-\infty}^{\infty} P(X = x)P(Y = z-x)

 f(z) = \sum_{x=-\infty}^{\infty} f_{X}(x)f_{Y}(z-x)

This property is called the convolution of  f_{X}(x) and  f_{Y}(y).
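For the dice example, this discrete convolution can be verified with a short Python sketch using NumPy’s convolve (an illustration; any convolution routine would do):

```python
import numpy as np

# PMF of one fair die over the faces 1..6
die = np.full(6, 1 / 6)

# Convolving the two PMFs gives the PMF of Z = X + Y over the sums 2..12
pmf_z = np.convolve(die, die)

for z, p in zip(range(2, 13), pmf_z):
    print(f"P(Z = {z:2d}) = {p:.4f}")  # P(Z = 3) prints as 0.0556 = 2/36
```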

J: Then, I suppose for the continuous distribution case it will be analogous. The summation will become an integration.

D: Yes. If X and Y are two independent random variables with probability density functions f_{X}(x) and f_{Y}(y), their sum Z = X + Y is a random variable with a probability density function f_{Z}(z) that is the convolution of f_{X}(x) and f_{Y}(y).

 f_{Z}(z) = f_{X}*f_{Y}(z) = \int_{-\infty}^{\infty}f_{X}(x)f_{Y}(z-x)dx

The density of the sum of two independent random variables is the convolution of their densities.

The exact mathematical proof can also be derived, but maybe we leave that to a later conversation.
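As a numerical illustration of the convolution integral — a minimal sketch assuming two independent Uniform(0, 1) variables, whose sum has the well-known triangular density — one can evaluate the integral directly:

```python
from scipy.integrate import quad

def f_uniform(x):
    # Density of a Uniform(0, 1) random variable
    return 1.0 if 0.0 <= x <= 1.0 else 0.0

def f_sum(z):
    # Convolution integral: f_Z(z) = integral of f_X(x) f_Y(z - x) dx;
    # the integrand is nonzero only for max(0, z-1) <= x <= min(1, z)
    lo, hi = max(0.0, z - 1.0), min(1.0, z)
    if lo >= hi:
        return 0.0
    value, _ = quad(lambda x: f_uniform(x) * f_uniform(z - x), lo, hi)
    return value

# The triangular density is f_Z(z) = z on [0, 1] and 2 - z on [1, 2]
for z in (0.5, 1.0, 1.5):
    print(z, round(f_sum(z), 4))  # 0.5, 1.0, 0.5
```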

J: Understood. But like all basics, we saw this for two random variables. How, then, can we extend this to the sum of n random variables? I am beginning to make connections to the Gamma distribution case that has the sum of n exponential random variables.

D: That is a good point. Now let’s suppose  S_{n} = X_{1} + X_{2} + ... + X_{n} is the sum of n independent random variables. We can always rewrite this as S_{n} = S_{n-1} + X_{n} and find the probability distribution function of S_{n} through induction.
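In code, this induction is just one convolution per added variable. Here is a minimal sketch of the S_{n} = S_{n-1} + X_{n} idea, using fair dice as the building block (an illustration only):

```python
import numpy as np
from functools import reduce

die = np.full(6, 1 / 6)  # PMF of a single fair die, faces 1..6

def sum_pmf(n):
    # S_n = S_{n-1} + X_n: fold in one convolution per added die
    return reduce(np.convolve, [die] * n)

pmf_s3 = sum_pmf(3)        # PMF of the sum of three dice, over the sums 3..18
print(pmf_s3[0], 1 / 216)  # P(S_3 = 3): only the combination (1, 1, 1) works
```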

J: Got it. It seems to follow the logic. Now, let me use this reasoning and walk through the derivation of the Gamma distribution.

D: Go for it. The floor is yours.

J: I will start with the two variable case. Our second meeting happened at lesson 9, and the time to the second arrival from the origin is  T_{2} = t_{1} + t_{2} = 9.

The random variable  T_{2} is the sum of two random variables  t_{1} and  t_{2} . I want to determine the probability density function of  T_{2} . I will apply the convolution theory. For consistency with today’s notations, let me take  T_{2} = Z ,  t_{1} = X , and  t_{2} = Y .

f_{Z}(z) = f_{X}*f_{Y}(z) = \int_{-\infty}^{\infty}f_{X}(x)f_{Y}(z-x)dx

f_{Z}(z) = f_{X}*f_{Y}(z) = \int_{0}^{z}\lambda e^{-\lambda x}\lambda e^{-\lambda (z-x)} dx

Both X and Y are bounded below at 0. Since y = z - x,  y \ge 0 implies  x \le z, and  x \ge 0 implies  y \le z. Either way, the limits of the integral are from 0 to z.

f_{Z}(z) = f_{X}*f_{Y}(z) = \int_{0}^{z}\lambda^{2} e^{-\lambda x} \frac{e^{-\lambda z}}{e^{-\lambda x}} dx

f_{Z}(z) = f_{X}*f_{Y}(z) = \lambda^{2} e^{-\lambda z} \int_{0}^{z} dx

f_{Z}(z) = f_{X}*f_{Y}(z) = \lambda^{2} z e^{-\lambda z}

D: Excellent. Let me show how this function looks.
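A minimal plotting sketch (λ = 0.5 is an assumed value, purely for illustration) places the derived density next to the exponential density:

```python
import numpy as np
import matplotlib.pyplot as plt

lam = 0.5               # an assumed rate; any lambda > 0 shows the same shape
z = np.linspace(0, 20, 400)

# Exponential density vs. the derived density of the sum of two exponentials
plt.plot(z, lam * np.exp(-lam * z), label=r"$\lambda e^{-\lambda z}$")
plt.plot(z, lam**2 * z * np.exp(-lam * z), label=r"$\lambda^2 z e^{-\lambda z}$")
plt.xlabel("z")
plt.ylabel("density")
plt.legend()
plt.show()
```

Unlike the exponential density, which peaks at z = 0, the density of the sum starts at 0 and peaks at z = 1/λ.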

Do you see how the Gamma distribution is evolving out of the exponential distribution?

J: Yes. Very clear.

D: Continue with your derivation.

J: For the three-variable case, I will take P = X + Y + S. I can also write this as P = Z + S, analogous to  S_{n} = S_{n-1}+X_{n} .

Then I have the distribution function of P as

f_{P}(p) = f_{Z}*f_{S}(p) = \int_{0}^{p}\lambda^{2} z e^{-\lambda z} \lambda e^{-\lambda (p-z)} dz

f_{P}(p) = f_{Z}*f_{S}(p) = \lambda^{3} \int_{0}^{p}z e^{-\lambda z}\frac{e^{-\lambda p}}{e^{-\lambda z}} dz

f_{P}(p) = f_{Z}*f_{S}(p) = \lambda^{3} e^{-\lambda p} \int_{0}^{p} z dz

f_{P}(p) = f_{Z}*f_{S}(p) = \lambda^{3} e^{-\lambda p} \frac{p^{2}}{2}
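This integral can be double-checked symbolically — a minimal SymPy sketch, only as a verification of the algebra above:

```python
import sympy as sp

lam, z, p = sp.symbols("lambda z p", positive=True)

# Convolve f_Z(z) = lam^2 z e^{-lam z} with f_S(s) = lam e^{-lam s}
integrand = lam**2 * z * sp.exp(-lam * z) * lam * sp.exp(-lam * (p - z))
f_p = sp.integrate(integrand, (z, 0, p))
print(sp.simplify(f_p))  # lambda**3 * p**2 * exp(-lambda*p) / 2
```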

We can rewrite this as

f_{P}(p) = f_{Z}*f_{S}(p) = \frac{\lambda e^{-\lambda p} (\lambda p)^{(3-1)}}{(3-1)!}

so that the general Gamma distribution function for the sum of r exponential random variables becomes,

 f(t) = \frac{\lambda e^{-\lambda t}(\lambda t)^{r-1}}{(r-1)!}
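As a sanity check, this derived formula should match a standard Gamma density with shape r and rate λ. A minimal sketch using scipy.stats.gamma (the values of λ and r are assumptions for illustration):

```python
import math
import numpy as np
from scipy.stats import gamma

lam, r = 0.5, 3  # assumed rate and number of summed exponential variables

def derived_pdf(t, lam, r):
    # The derived density: lam * e^{-lam t} * (lam t)^{r-1} / (r-1)!
    return lam * np.exp(-lam * t) * (lam * t) ** (r - 1) / math.factorial(r - 1)

t = np.linspace(0.1, 20, 50)
# scipy parameterizes the Gamma by shape a and scale = 1 / rate
print(np.allclose(derived_pdf(t, lam, r), gamma.pdf(t, a=r, scale=1 / lam)))  # True
```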

D: Joe, that is a well-thought-out derivation. You are really into this data analysis stuff now.

J: 😎 😎 Do we have anything else to cover today?

D: Using the same logic, can you derive the distribution for the sum of normals?

J: Normals? πŸ˜• πŸ˜• πŸ˜•

 

Oops, I think that is for next week.
Don’t you have to get ready for the New Year parties? It may be the coldest New Year’s Eve on record. So you better bundle up!

Happy New Year.

If you find this useful, please like, share and subscribe.
You can also follow me on Twitter @realDevineni for updates on new lessons.
