Lesson 59 – The Generalized extreme value distribution

Recap

Two weeks ago, we met Maximus Extremus Distributus. He takes different forms depending on the origin (parent) distribution.

Last week, Mumble, Joe, and Devine met to discuss the central ideas behind the extreme value distribution. They derived its exact cumulative distribution function.

F_{Y}(y)=P(Y \le y) = P(X_{1} \le y)P( X_{2} \le y) ...P(X_{n} \le y) = [F_{X}(y)]^{n}.

From this cumulative distribution function, we can derive the probability density function.

f_{Y}(y) = n[F_{X}(y)]^{n-1}f_{X}(y)
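If you want to see this in action before we get to R, here is a quick numerical check (a Python sketch; the sample size, rate, and evaluation point are arbitrary choices of mine): simulate Y = max(X_1, ..., X_n) for exponential X's and compare the empirical CDF with [F_X(y)]^n.

```python
import math
import random

random.seed(42)

n = 10            # number of X's behind each maximum (arbitrary choice)
trials = 100_000  # number of simulated maxima
lam = 1.0         # rate of the parent exponential distribution

# Simulate Y = max(X_1, ..., X_n) many times over
maxima = [max(random.expovariate(lam) for _ in range(n)) for _ in range(trials)]

# Empirical P(Y <= y) against the exact [F_X(y)]^n = [1 - e^{-lam*y}]^n
y = 2.5
empirical = sum(m <= y for m in maxima) / trials
exact = (1 - math.exp(-lam * y)) ** n
print(f"empirical: {empirical:.4f}, exact: {exact:.4f}")
```

The two numbers should agree to within simulation noise, which is the whole content of the derivation above.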

Mumble pointed out the issue of degeneration of the exact function as n \to \infty and introduced the idea of normalizing constants to stabilize the function.

If there are two normalizing constants a_{n}>0 and b_{n}, we can create a normalized version of Y as Y^{*} = \frac{Y-b_{n}}{a_{n}}.

Before we closed off for the week, we learned that Y^{*} converges to one of three types: the Type I, Type II, and Type III extreme value distributions.

If there exist normalizing constants a_{n}>0 and b_{n}, then,

P(\frac{Y - b_{n}}{a_{n}} \le z) \to G(z) as n \to \infty.

G(z) is the non-degenerate cumulative distribution function.

Type I (Gumbel Distribution): G(z) = e^{-e^{-\frac{z-\alpha}{\beta}}}. Double exponential distribution.

Type II (Frechet Distribution): G(z) = e^{-(\frac{z-\alpha}{\beta})^{-\gamma}} for z > \alpha and 0 for z \le \alpha. This is a single exponential function.

Type III (Weibull Distribution): G(z) = e^{-[-\frac{z-\alpha}{\beta}]^{\gamma}} for z < \alpha and 1 for z \ge \alpha. This is also a single exponential function.
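For concreteness, here is a small Python sketch of the three limiting CDFs. The function names and the default \alpha, \beta, \gamma values are my own illustrative choices, not part of the lesson:

```python
import math

def gumbel_cdf(z, alpha=0.0, beta=1.0):
    """Type I: double exponential, defined for all z."""
    return math.exp(-math.exp(-(z - alpha) / beta))

def frechet_cdf(z, alpha=0.0, beta=1.0, gamma=2.0):
    """Type II: single exponential, defined for z > alpha; 0 otherwise."""
    if z <= alpha:
        return 0.0
    return math.exp(-((z - alpha) / beta) ** (-gamma))

def weibull_cdf(z, alpha=0.0, beta=1.0, gamma=2.0):
    """Type III: single exponential, defined for z < alpha; 1 otherwise."""
    if z >= alpha:
        return 1.0
    return math.exp(-(-(z - alpha) / beta) ** gamma)

for z in (-1.0, 0.5, 2.0):
    print(z, gumbel_cdf(z), frechet_cdf(z), weibull_cdf(z))
```

Notice in the output how the Frechet CDF is exactly 0 below its lower bound and the Weibull CDF is exactly 1 above its upper bound, while the Gumbel CDF lives on the whole line.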

The convergence

Let’s get some intuition on why the parent distributions converge to these three types. The mathematical foundations are much more involved; we will defer them to later lessons and build a simple intuition here.

Exponential origin: Let’s take Joe’s wait time example from last week. We assume that the time between successive vehicle arrivals has an exponential distribution. Let’s call this random variable X.

X \sim Exp(\lambda)

There are n such inter-arrival times, which we can write as a set of random variables.

X_{1}, X_{2}, X_{3}, ..., X_{n}

The maximum wait time is the max of these numbers. Let’s call this maximum time, Y.

Y = max(X_{1}, X_{2}, X_{3}, ..., X_{n})

The cumulative function for an exponential distribution is F_{X}(x) = 1 - e^{-\lambda x}. Hence, for Y (maximum wait time), it will be

F_{Y}(y) = [1 - e^{-\lambda y}]^{n}

For simplicity, let’s assume a value of 1 for \lambda and take the binomial series expansion for F_{Y}(y) = [1 - e^{-\lambda y}]^{n}.

F_{Y}(y) = 1 - ne^{-y} + \frac{n(n-1)}{2!}e^{-2y} - ...

As n \to \infty, this series converges to e^{-ne^{-y}}, an asymptotic double exponential function.
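We can check this numerically. The asymptotic form is meaningful near the typical size of the maximum, y \approx ln(n), so that is where we compare (a Python sketch; the offset 0.5 is an arbitrary choice of mine):

```python
import math

# Compare the exact CDF [1 - e^{-y}]^n with the asymptotic form e^{-n e^{-y}}
# near y = ln(n), where the maximum of n exponentials typically sits.
for n in (10, 100, 1000):
    y = math.log(n) + 0.5
    exact = (1 - math.exp(-y)) ** n
    asymptotic = math.exp(-n * math.exp(-y))
    print(f"n={n}: exact={exact:.6f}, asymptotic={asymptotic:.6f}")
```

The two columns draw closer as n grows, which is the convergence Mumble promised.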

Norming Constants: Now let’s derive the double exponential function with the idea of norming constants. We know F_{Y}(y) = [1 - e^{-y}]^{n} for \lambda = 1.

Let’s introduce a variable z = y + ln(n) and evaluate the function F_{Y}(y) at z. We are adding a constant ln(n) to y.

F_{Y}(y+ln(n)) = [1 - e^{-(y+ln(n))}]^{n}

F_{Y}(y+ln(n)) = [1 - \frac{1}{e^{ln(n)}}e^{-y}]^{n}

F_{Y}(y+ln(n)) = [1 - \frac{1}{n}e^{-y}]^{n}

As n \to \infty, F_{Y}(y+ln(n)) converges to e^{-e^{-y}}, a double exponential function. If you observe the equation carefully, it is of the form [1+\frac{1}{n}x]^{n}, which in the limit is e^{x}. In our case, x = -e^{-y}. Hence,

F_{Y}(y+ln(n)) =e^{-e^{-y}}

If we replace y = z - ln(n), we get,

F_{Y}(z) =e^{-e^{-(z-ln(n))}}

So, with appropriate scaling (stabilization/norming), we see a double exponential function when the origin is an exponential function.
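A quick numerical look at this limit (a Python sketch; the evaluation point y = 0 is an arbitrary choice of mine): [1 - \frac{1}{n}e^{-y}]^{n} settles onto e^{-e^{-y}} as n grows.

```python
import math

# F_Y(y + ln(n)) = [1 - e^{-y}/n]^n should settle onto the Gumbel limit
# e^{-e^{-y}} as n grows.
y = 0.0
limit = math.exp(-math.exp(-y))
for n in (10, 100, 10_000):
    shifted = (1 - math.exp(-y) / n) ** n
    print(f"n={n}: {shifted:.6f}  (limit {limit:.6f})")
```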

Power law origin: Let’s try one more parent function. This time, it is a power law (Pareto-type) function with a cumulative distribution function F_{X}(x)=1 - x^{-a} for x \ge 1. We did not learn this distribution, but it is like a decay function, with a controlling the degree of decay. The distribution of the maximum value Y of n samples from this parent is

F_{Y}(y) = [1 - y^{-a}]^{n}

We will assume a new variable z = n^{1/a}y and evaluate the function at z. (The scaling constant n^{1/a} is chosen so that (n^{1/a}y)^{-a} = n^{-1}y^{-a}.)

F_{Y}(n^{1/a}y) = [1 - (n^{1/a}y)^{-a}]^{n}

=[1 - n^{-1}y^{-a}]^{n}

=[1 + \frac{1}{n}(-y^{-a})]^{n}

In the limit, as n \to \infty, F_{Y}(n^{1/a}y) = e^{-y^{-a}}

Hence, F_{Y}(z)=e^{-(\frac{z}{n^{1/a}})^{-a}}

So, origin distributions with power law tails converge to the single exponential Type II Frechet distribution. Similar norming constants can be found for other distributions that converge to the Type III Weibull distribution.
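As a numerical check of this convergence too (a Python sketch; the values of a and y are arbitrary choices of mine), we can watch [1 - \frac{1}{n}y^{-a}]^{n} approach the Frechet limit e^{-y^{-a}}:

```python
import math

# [1 - (1/n) * y^{-a}]^n should approach the Frechet limit e^{-y^{-a}}.
a = 2.0      # decay exponent of the power law parent (arbitrary choice)
y = 1.5      # a fixed evaluation point (arbitrary choice)
limit = math.exp(-y ** (-a))
for n in (10, 100, 10_000):
    value = (1 - y ** (-a) / n) ** n
    print(f"n={n}: {value:.6f}  (limit {limit:.6f})")
```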

To summarize, the three types are e^{-e^{-\frac{z-\alpha}{\beta}}}, e^{-(\frac{z-\alpha}{\beta})^{-\gamma}}, e^{-[-\frac{z-\alpha}{\beta}]^{\gamma}}.

They are of the form e^{-e^{-y}}, e^{-y^{-\gamma}}, e^{-(-y)^{\gamma}}.

Here’s a visual of how these three distributions look.

If the right tail is of exponential type, the extreme value distribution is a Gumbel distribution. Here the parent distribution (or the distribution of X) is unbounded on the right tail. Extremes of most common exponential type distributions such as normal, lognormal, exponential and gamma distributions converge to the double exponential Gumbel distribution. It is most commonly used to model maximum streamflow, maximum rainfall, earthquake occurrence and in some cases, maximum wind speed.

The Frechet distribution, like the Gumbel distribution, is unbounded on the right tail, and its tail is much fatter. Extremes from the Pareto (power law) and Cauchy distributions converge to the Frechet distribution. Rainfall and streamflow extremes, air pollution and economic impacts can be modeled using this type. Notice how the red line (Frechet distribution) has a heavy tail and sits above the black line (Gumbel distribution).

If the right tail converges to a finite endpoint, it is a Weibull distribution. The parent distribution is also bounded on the right. Remember that the extremes of the uniform distribution, which is bounded, converged to a Weibull. It is most widely used for the minima of the strength of materials and in fatigue analysis. It is also used in modeling temperature extremes and sea level.
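Here is a small simulation sketch of the bounded case (in Python; the sample sizes are arbitrary choices of mine): maxima of Uniform(0,1) samples, normed as n(Y - 1), approach the Weibull-type limit e^{z} for z < 0, with \gamma = 1 and the endpoint at 0.

```python
import math
import random

random.seed(7)
n, trials = 200, 20_000

# Maxima of Uniform(0,1) samples. With the norming n*(Y - 1), the limiting
# CDF is the Type III (Weibull) form e^{z} for z < 0: gamma = 1 and the
# right endpoint sits at 0.
z = -1.0
count = 0
for _ in range(trials):
    y = max(random.random() for _ in range(n))
    if n * (y - 1) <= z:
        count += 1
empirical = count / trials
limit = math.exp(z)
print(f"empirical: {empirical:.4f}, limit: {limit:.4f}")
```

The empirical probability lands within simulation noise of e^{-1}, the Weibull-type limit at z = -1.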

The Generalized Extreme Value Distribution (GEV)

The three types of extreme value distributions can be combined into a single function called the generalized extreme value distribution (GEV). Richard von Mises and Jenkinson independently showed this.

G(z) = e^{-[1+\xi(\frac{z-\mu}{\sigma})]^{-1/\xi}_{+}}

\mu is the location parameter. \sigma > 0 is the scale parameter. \xi is the shape parameter.

When \xi \to 0, GEV tends to a Gumbel distribution. It is the same limiting form [1+\frac{1}{n}x]^{n} \to e^{x}. For GEV, the term [1+\xi(\frac{z-\mu}{\sigma})]^{-1/\xi} goes to e^{-(\frac{z-\mu}{\sigma})}, hence yielding the double exponential G(z) = e^{-e^{-(\frac{z-\mu}{\sigma})}}.

When \xi > 0, GEV tends to the Frechet distribution. Replace \xi = 1 and see for yourself what you get.

When \xi < 0, GEV tends to the Weibull distribution. Replace \xi = -1 and check.

\mu and \sigma are the surrogate norming variables and \xi controls the shape.

GEV folds all three types into one form. The parameters \mu, \sigma and \xi can be estimated from the data, hence removing the need to know which type a parent distribution or data converges to. The function has a closed form solution to compute the quantiles and probabilities. GEV also has the max-stable property, about which we will learn in later lessons.
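To make the folding concrete, here is a sketch of the GEV CDF in Python (my own helper function, not from any particular library), showing how the \xi \to 0 case reproduces the Gumbel form and how the [\,.\,]_{+} part clips the support:

```python
import math

def gev_cdf(z, mu=0.0, sigma=1.0, xi=0.0):
    """CDF of the generalized extreme value distribution.

    xi = 0 is the Gumbel limit; xi > 0 gives Frechet-type tails,
    xi < 0 gives Weibull-type (bounded) tails.
    """
    s = (z - mu) / sigma
    if abs(xi) < 1e-12:          # Gumbel limit as xi -> 0
        return math.exp(-math.exp(-s))
    t = 1 + xi * s
    if t <= 0:                   # outside the support: the [.]_+ part
        return 0.0 if xi > 0 else 1.0
    return math.exp(-t ** (-1 / xi))

# A tiny xi should agree with the Gumbel form to many decimal places
print(gev_cdf(1.0, xi=1e-9), math.exp(-math.exp(-1.0)))
```

Try \xi = 1 and \xi = -1 yourself and compare with the Frechet and Weibull forms from the summary above.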

All these concepts will become more concrete once we play with some data. Aren’t you itching to do some GEV in R? You know what is coming next week.

If you find this useful, please like, share and subscribe.
You can also follow me on Twitter @realDevineni for updates on new lessons.

One thought on “Lesson 59 – The Generalized extreme value distribution”

  1. Hello,
    Your lessons are just incredible. I have been planning to learn about EVT for a long time, but the convergence concepts are too mysterious for me. Now with relaxed and conversational explanations, EVT does not appear as mysterious. Thank you for that.

I thought CDF of Power law distribution is (x/b)^c, 0 <= x <= b, b > 0, c > 0.
    The formula you show for power law appears to be more like that of Pareto. I know power and Pareto are inverses of each other. Can you please clarify my confusion about Max of power law parent distribution leading to Frechet distribution?

    Another trouble understanding is
    EXP[-(-(x-alpha)/beta)]^gamma looks nothing like Weibull in the usual parameterization. I read somewhere that it is reversed Weibull. I don’t get it completely. Can you explain the reversed Weibull business.

    Again, excellent, excellent lessons. Amazed that so much effort is put into it. I can’t write a sentence with math notation without a typo. I did not find any typos here despite very cumbersome and difficult notation.

    Are the lessons pretty much predetermined? Can we ask for requests on a particular topic?

    Again, thanks.

    H. G.
