Appendix

A: Generalized Extreme Value distribution

Let $X_1,\ldots,X_n$ be a sequence of independent and identically distributed (i.i.d.) random variables with distribution function $F$, and let $M_n = \max\{X_1,\ldots,X_n\}$. For known $F$, the distribution of $M_n$ can be derived exactly for all values of $n$ because $\Pr\{M_n \le u\} = \Pr\{X_i \le u \text{ for all } i=1,\ldots,n\}$, which by the independence of the $X_i$ equals $\Pr\{X_1 \le u\}\,\Pr\{X_2 \le u\}\cdots\Pr\{X_n \le u\}$, and because the $X_i$ are identically distributed this equals $(\Pr\{X_1 \le u\})^n$. Thus, $\Pr\{M_n \le u\} = (F(u))^n$. Note, however, that the independence assumption, which virtually never holds for weather and climate variables, can be relaxed (see Dependence Issues).

The problem with the above exact distribution is that $F$ is generally not known in practice and must therefore be estimated. However, small discrepancies between $F$ and its estimate, say $\hat{F}$, can lead to large discrepancies between $F^n$ and $\hat{F}^n$. A widely accepted alternative is to accept $F$ as unknown and look for approximate models for $F^n$ that can be estimated on the basis of the extreme data alone.

Of course, for any fixed $u$ with $F(u) < 1$, $F^n(u)$ quickly approaches zero as $n$ increases, because $F$ is a distribution function and therefore takes values only between zero and one. That is, $F^n(u) \to 0$ as $n \to \infty$. Thus, in order to achieve a nondegenerate limit it is necessary to find sequences of constants $\{a_n > 0\}$ and $\{b_n\}$ such that the distribution of $(M_n - b_n)/a_n$ remains nondegenerate as $n \to \infty$. Specifically, we seek $\{a_n > 0\}$ and $\{b_n\}$ such that $\Pr\{(M_n - b_n)/a_n \le z\} = F^n(a_n z + b_n) \to G(z)$, where $G(z)$ does not depend on $n$.

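For example, suppose $F(x) = 1 - e^{-x}$ for $x > 0$ (exponential distribution). Then, $\Pr\{(M_n - b_n)/a_n \le u\} = \Pr\{M_n \le b_n + a_n u\} = F^n(b_n + a_n u)$. Letting $a_n = 1$ and $b_n = \log n$ yields the following.

\[
F^n(\log n + u) = \left[1 - \exp\{-(\log n + u)\}\right]^n = \left[1 - \tfrac{1}{n}\, e^{-u}\right]^n \to \exp\{-e^{-u}\} \quad \text{as } n \to \infty,
\]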

which is a distribution known as the Gumbel distribution.
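The convergence in this example can be checked numerically. The following is a minimal sketch using only the Python standard library; the function names `exp_max_cdf` and `gumbel_cdf` are illustrative and not part of any toolkit.

```python
import math


def exp_max_cdf(n: int, u: float) -> float:
    """Exact Pr{M_n - log n <= u} for n i.i.d. standard exponential variables."""
    x = math.log(n) + u
    if x <= 0.0:
        return 0.0  # the exponential distribution puts no mass below zero
    return (1.0 - math.exp(-x)) ** n


def gumbel_cdf(u: float) -> float:
    """Standard Gumbel distribution function exp{-exp(-u)}."""
    return math.exp(-math.exp(-u))


# Largest gap between the exact and limiting distributions on a small grid of u.
for n in (10, 100, 10_000):
    gap = max(abs(exp_max_cdf(n, u) - gumbel_cdf(u)) for u in (-1.0, 0.0, 1.0, 2.0))
    print(n, gap)
```

The printed gap shrinks as n grows, which is the behavior the limit above describes.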

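In fact, the Gumbel is one of three possible types of distributions to which $F^n$, suitably normalized, can converge. The three types are:

I. Gumbel

\[
G(z) = \exp\left\{-\exp\left[-\left(\frac{z-b}{a}\right)\right]\right\}, \quad -\infty < z < \infty
\]

II. Fréchet

\[
G(z) = \begin{cases} 0, & z \le b, \\ \exp\left\{-\left(\dfrac{z-b}{a}\right)^{-\alpha}\right\}, & z > b \end{cases}
\]

III. Weibull

\[
G(z) = \begin{cases} \exp\left\{-\left[-\left(\dfrac{z-b}{a}\right)\right]^{\alpha}\right\}, & z < b, \\ 1, & z \ge b \end{cases}
\]

for parameters $a > 0$, $b$ and $\alpha > 0$. The above three families of distributions can be combined into one family of distributions known as the generalized extreme value (GEV) family. Namely,

\[
G(z) = \exp\left\{-\left[1 + \xi\left(\frac{z-\mu}{\sigma}\right)\right]^{-1/\xi}\right\},
\]

where $\{z : 1 + \xi(z-\mu)/\sigma > 0\}$, $-\infty < \mu, \xi < \infty$, and $\sigma > 0$. Please refer to Coles (2001) (b) for more information on the GEV family.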
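As an illustration, the GEV distribution function can be evaluated as below. This is a sketch under stated assumptions: the parameters are written `mu` (location), `sigma` (scale) and `xi` (shape), the Gumbel case is taken as the limit when the shape is zero, and the function name `gev_cdf` is hypothetical rather than a toolkit function.

```python
import math


def gev_cdf(z: float, mu: float = 0.0, sigma: float = 1.0, xi: float = 0.0) -> float:
    """GEV distribution function G(z) = exp{-[1 + xi (z - mu) / sigma]^(-1/xi)}."""
    if sigma <= 0.0:
        raise ValueError("sigma must be positive")
    s = (z - mu) / sigma
    if abs(xi) < 1e-12:
        # Gumbel type, obtained as the limit xi -> 0.
        return math.exp(-math.exp(-s))
    t = 1.0 + xi * s
    if t <= 0.0:
        # Outside the support: below the lower end point when xi > 0 (Frechet type),
        # above the upper end point when xi < 0 (Weibull type).
        return 0.0 if xi > 0.0 else 1.0
    return math.exp(-(t ** (-1.0 / xi)))


print(gev_cdf(0.0))                               # Gumbel case
print(gev_cdf(2.0, mu=1.0, sigma=0.5, xi=0.5))    # Frechet case
print(gev_cdf(3.0, xi=-0.5))                      # Weibull case, above the end point
```

A positive shape corresponds to the Fréchet type, a negative shape to the Weibull type, and a zero shape to the Gumbel type. For instance, `mu=1`, `sigma=0.5`, `xi=0.5` gives $\exp(-z^{-2})$ for $z > 0$.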

B: Threshold Exceedances

Modeling only block maxima is wasteful if other data on extremes are available (Coles (2001) (b)). Let $X_1, X_2, \ldots$ be a sequence of independent and identically distributed (i.i.d.) random variables with distribution function $F$. Now, for some threshold, $u$, it follows that

\[
\Pr\{X > u + y \mid X > u\} = \frac{1 - F(u+y)}{1 - F(u)}, \quad y > 0.
\]

If $F$ is known, then so is the above probability. However, this is often not the case in practical applications, and so approximations that are acceptable for high values of the threshold are sought, similar to using the GEV distribution for block maxima.

The generalized Pareto distribution (GPD) arises in the peaks over threshold (POT)/point process (PP) approach.

Generalized Pareto Distribution

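Again, let $X_1, X_2, \ldots$ be a sequence of i.i.d. random variables with common distribution function $F$, and let $M_n = \max\{X_1,\ldots,X_n\}$. Assuming $F$ satisfies certain conditions (see Coles (2001) (b) for more information), we have for large $n$ that $\Pr\{M_n \le z\} \approx G(z)$, where

\[
G(z) = \exp\left\{-\left[1 + \xi\left(\frac{z-\mu}{\sigma}\right)\right]^{-1/\xi}\right\}
\]

for some $\mu$, $\sigma > 0$ and $\xi$. Then for a large enough threshold, $u$, the distribution function of $(X - u)$, conditional on $X > u$, is approximately

\[
H(y) = 1 - \left(1 + \frac{\xi y}{\tilde{\sigma}}\right)^{-1/\xi},
\]

defined on $\{y : y > 0 \text{ and } (1 + \xi y/\tilde{\sigma}) > 0\}$, where $\tilde{\sigma} = \sigma + \xi(u - \mu)$. $H(y)$ is referred to as the generalized Pareto distribution (GPD). Again, see Coles (2001) (b) for more information on the generalized Pareto distribution.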
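The link between the GEV approximation for maxima and the GPD for threshold excesses can be sketched heuristically as follows. This outline is not a proof; it assumes the GEV approximation $F^n(z) \approx G(z)$ with parameters $\mu$, $\sigma$, $\xi$, and writes $\tilde{\sigma} = \sigma + \xi(u - \mu)$.

```latex
\begin{align*}
F^n(z) \approx G(z)
  &\;\Longrightarrow\; n \log F(z) \approx -\left[1 + \xi\,\frac{z-\mu}{\sigma}\right]^{-1/\xi}, \\
\log F(z) \approx -\{1 - F(z)\} \ \text{for large } z
  &\;\Longrightarrow\; 1 - F(u) \approx \frac{1}{n}\left[1 + \xi\,\frac{u-\mu}{\sigma}\right]^{-1/\xi}, \\
\Pr\{X > u + y \mid X > u\} = \frac{1 - F(u+y)}{1 - F(u)}
  &\approx \left[\frac{1 + \xi\,(u + y - \mu)/\sigma}{1 + \xi\,(u - \mu)/\sigma}\right]^{-1/\xi}
   = \left[1 + \frac{\xi y}{\tilde{\sigma}}\right]^{-1/\xi}.
\end{align*}
```

Subtracting the last expression from one gives the distribution function of the excess, which has the generalized Pareto form with the same shape parameter $\xi$ as the GEV and a scale that depends on the threshold.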

Peaks Over Threshold (POT)/Point Process (PP) Approach

The point process approach to the threshold excesses problem provides an interpretation of extreme value behavior that unifies all of the other models. Additionally, it leads directly to a likelihood that enables a more natural formulation of non-stationarity in threshold excesses than can be obtained from the generalized Pareto model (Coles (2001) (b)). That is, in this approach, the times at which high threshold exceedances occur and the excess values over the threshold are combined into one process based on a two-dimensional plot of exceedance times and exceedance values. The asymptotic theory of threshold exceedances shows that under suitable normalization, this process behaves like a nonhomogeneous Poisson process (Smith (2002)). For more information on this approach, see Coles (2001) (b), Smith (1989) and Smith (2002).

Selecting a Threshold

Selecting an appropriate threshold is a critical problem with the POT methods. Too low a threshold is likely to violate the asymptotic basis of the model, leading to bias, and too high a threshold will generate too few excesses, leading to high variance. The idea is to pick as low a threshold as possible subject to the limit model providing a reasonable approximation. Two methods are available for this: the first is an exploratory technique carried out prior to model estimation, and the second is an assessment of the stability of parameter estimates based on the fitting of models across a range of different thresholds (Coles (2001) (b)).

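Suppose the raw data consist of a sequence of i.i.d. measurements $x_1,\ldots,x_n$ and let $x_{(1)},\ldots,x_{(k)}$ represent the subset of data points that exceed a particular threshold, $u$. Define threshold excesses by $y_j = x_{(j)} - u$ for $j = 1,\ldots,k$. The first method requires plotting the points

\[
\left\{\left(u,\ \frac{1}{k}\sum_{j=1}^{k}\left(x_{(j)} - u\right)\right) : u < x_{\max}\right\},
\]

where $k$ depends on $u$ and $x_{\max}$ is the largest of the $x_i$. The resulting plot is called the mean residual life plot in engineering and the mean excess function in the extremes community.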
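The points of a mean residual life plot can be computed as in the sketch below, which uses only the standard library. The function name `mean_excess_points` and the choice to skip thresholds with no exceedances are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def mean_excess_points(data: Sequence[float],
                       thresholds: Sequence[float]) -> List[Tuple[float, float]]:
    """Return (u, mean of x - u over all x > u) for each threshold u."""
    points = []
    for u in thresholds:
        excesses = [x - u for x in data if x > u]
        if excesses:  # thresholds at or above the largest value have no excesses
            points.append((u, sum(excesses) / len(excesses)))
    return points


print(mean_excess_points([1, 2, 3, 4, 5], [0, 2, 4.5, 10]))
```

If the generalized Pareto model holds above some threshold, the mean excess is linear in $u$ beyond it (for shape parameter less than one). A common reading of the plot is therefore to look for the lowest threshold above which the points are approximately linear.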

Poisson-GP Model

The parameters of the point process model can be expressed in terms of those of the GEV distribution or, equivalently through transformations specified below, in terms of the parameters of a Poisson process and the GPD (i.e., a Poisson-GP model). Specifically, given $\mu$, $\sigma$ and $\xi$ from the point process model, we have the following equations.

\[
\log\lambda = -\frac{1}{\xi}\log\left[1 + \xi\,\frac{u-\mu}{\sigma}\right] \tag{B.1}
\]
\[
\sigma^* = \sigma + \xi(u - \mu) \tag{B.2}
\]

where $\lambda$ is the Poisson rate parameter, $\sigma^*$ is the scale parameter of the GP and $\sigma$ the scale of the point process model. Eqs. (B.1) and (B.2) can be solved simultaneously for $\sigma$ and $\mu$ to obtain the parameters of the associated point process model (Katz et al. (2002)). Specifically, solving Eqs. (B.1) and (B.2) for $\sigma$ and $\mu$ gives the following.

\[
\sigma = \sigma^* \lambda^{\xi} \tag{B.3}
\]
\[
\mu = u - \frac{\sigma}{\xi}\left(\lambda^{-\xi} - 1\right) \tag{B.4}
\]
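The two-way conversion can be sketched in code. The sketch assumes that (B.1) reads $\log\lambda = -(1/\xi)\log[1 + \xi(u-\mu)/\sigma]$ and (B.2) reads $\sigma^* = \sigma + \xi(u-\mu)$, and that the shape $\xi$ is nonzero. The function names are hypothetical and are not toolkit functions.

```python
import math
from typing import Tuple


def pp_to_poisson_gp(mu: float, sigma: float, xi: float, u: float) -> Tuple[float, float]:
    """Point process (mu, sigma, xi) and threshold u -> (lambda, sigma_star)."""
    lam = (1.0 + xi * (u - mu) / sigma) ** (-1.0 / xi)   # exponentiated (B.1)
    sigma_star = sigma + xi * (u - mu)                   # (B.2)
    return lam, sigma_star


def poisson_gp_to_pp(lam: float, sigma_star: float, xi: float, u: float) -> Tuple[float, float]:
    """Poisson-GP (lambda, sigma_star, xi) and threshold u -> (mu, sigma)."""
    sigma = sigma_star * lam ** xi                       # (B.3)
    mu = u - (sigma / xi) * (lam ** (-xi) - 1.0)         # (B.4)
    return mu, sigma


lam, sigma_star = pp_to_poisson_gp(mu=10.0, sigma=2.0, xi=0.1, u=15.0)
print(lam, sigma_star)
print(poisson_gp_to_pp(lam, sigma_star, xi=0.1, u=15.0))  # recovers (10, 2) up to rounding
```

Converting in one direction and then back recovers the original location and scale, which is a quick consistency check on the four equations.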


The block maxima and POT approaches can involve a difference in time scales, $h$. For example, if observations are daily ($h \approx 1/365$) and annual maxima are modelled, then it is possible to convert the parameters of the GEV distribution for time scale $h$ to the corresponding GEV parameters for time scale $h'$ (see Katz et al. (2005)) by converting the rate parameter, $\lambda$, to reflect the new time scale.
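One way to see how such a conversion works is through the max-stability of the GEV family: if $G$ is GEV with parameters $(\mu, \sigma, \xi)$, then $G^{\delta}$ is again GEV with the same shape and rescaled location and scale. The sketch below illustrates this property only. The factor `delta` stands for the ratio of the two time scales; whether it equals $h/h'$ or its reciprocal depends on the convention adopted and is left as an assumption here. The function names are hypothetical and the shape is assumed nonzero.

```python
import math
from typing import Tuple


def gev_cdf(z: float, mu: float, sigma: float, xi: float) -> float:
    """GEV distribution function for nonzero shape xi."""
    t = 1.0 + xi * (z - mu) / sigma
    if t <= 0.0:
        return 0.0 if xi > 0.0 else 1.0
    return math.exp(-(t ** (-1.0 / xi)))


def rescale_gev(mu: float, sigma: float, xi: float, delta: float) -> Tuple[float, float]:
    """Location and scale of the GEV distribution G^delta; the shape is unchanged."""
    sigma_new = sigma * delta ** xi
    mu_new = mu + sigma * (delta ** xi - 1.0) / xi
    return mu_new, sigma_new


mu, sigma, xi, delta = 10.0, 2.0, 0.1, 365.0
mu_new, sigma_new = rescale_gev(mu, sigma, xi, delta)
for z in (20.0, 30.0, 40.0):
    # The two columns agree: G(z)^delta equals the GEV with rescaled parameters.
    print(z, gev_cdf(z, mu, sigma, xi) ** delta, gev_cdf(z, mu_new, sigma_new, xi))
```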





C: Dependence Issues

The asymptotic approximations for maxima and for exceedances over a threshold assume that the data are independent and identically distributed (i.i.d.), but this is often not the case with real data. Nevertheless, the results can still be used, and there are a few different methods for dealing with this problem. One is to decluster the data so that cluster maxima are independent. Another is to incorporate the dependence into a trend. For some data, the results are still valid even without declustering or incorporating a trend.

When data are independent and identically distributed we have that $\Pr\{M_n \le u_n\} = F(u_n)^n$, but if there is dependence, then for large $n$ we still have that $\Pr\{M_n \le u_n\} \approx F(u_n)^{n\theta}$, where $\theta$ in the interval $(0,1]$ is called the extremal index (see O'Brien (1987)). If the data are independent, then $\theta = 1$; but the converse is not true (see, for example, pg. 97 of Coles (2001) (b)). At the other extreme, as $\theta \to 0$ the data are said to be perfectly dependent.

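Ferro and Segers (2003) present several estimates for the extremal index. The one they suggest as the "best" (and the one used by this toolkit) is defined as

\[
\hat{\theta} = \begin{cases}
\min\left\{1,\ \dfrac{2\left(\sum_{i=1}^{N-1} T_i\right)^2}{(N-1)\sum_{i=1}^{N-1} T_i^2}\right\}, & \max\{T_i : 1 \le i \le N-1\} \le 2, \\[2ex]
\min\left\{1,\ \dfrac{2\left[\sum_{i=1}^{N-1} (T_i - 1)\right]^2}{(N-1)\sum_{i=1}^{N-1} (T_i - 1)(T_i - 2)}\right\}, & \max\{T_i : 1 \le i \le N-1\} > 2,
\end{cases} \tag{C.1}
\]

where $N$ is the number of exceedances and $T_i$, $i = 1,\ldots,N-1$, are the interexceedance times (the lengths of time between successive exceedances).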
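A minimal sketch of this estimator follows. It assumes that (C.1) is the intervals estimator of Ferro and Segers (2003), with one branch for the case where no interexceedance time exceeds 2 and another otherwise, and that observations are equally spaced so that times can be taken as array indices. The function name `extremal_index_intervals` is hypothetical and this is not the toolkit's own code.

```python
from typing import Sequence


def extremal_index_intervals(series: Sequence[float], u: float) -> float:
    """Intervals estimate of the extremal index for exceedances of threshold u."""
    times = [i for i, x in enumerate(series) if x > u]
    n_exc = len(times)
    if n_exc < 2:
        raise ValueError("at least two exceedances are needed")
    # Interexceedance times T_1, ..., T_{N-1}.
    gaps = [b - a for a, b in zip(times, times[1:])]
    if max(gaps) <= 2:
        num = 2.0 * sum(gaps) ** 2
        den = (n_exc - 1) * sum(t * t for t in gaps)
    else:
        num = 2.0 * sum(t - 1 for t in gaps) ** 2
        den = (n_exc - 1) * sum((t - 1) * (t - 2) for t in gaps)
    return min(1.0, num / den)


# Exceedances at times 0, 1, 10, 11, 20, 21 (three clusters of two).
series = [0.0] * 22
for i in (0, 1, 10, 11, 20, 21):
    series[i] = 1.0
print(extremal_index_intervals(series, u=0.5))
```

With so few exceedances the estimate is only illustrative; in practice it would be computed from a long series and a high threshold.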

Probability and Quantile Plots for Non-stationary Sequences

For non-stationary time series, it is possible to incorporate a trend (or covariate) into the parameters of the GEV, GPD or Point Process models. Subsequently, each time point (or covariate value) has a different distribution associated with it. In order to plot model diagnostics, therefore, it is necessary to transform the data in such a way that each point has the same distribution. This can be accomplished in the following ways.

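In the case of the GEV distribution, if we have $Z_t$ distributed as a GEV$(\mu(t), \sigma(t), \xi(t))$, then the standardized variables,

\[
\tilde{Z}_t = \frac{1}{\xi(t)}\log\left\{1 + \xi(t)\left(\frac{Z_t - \mu(t)}{\sigma(t)}\right)\right\}, \tag{C.2}
\]

each have the standard Gumbel distribution with probability distribution function

\[
\Pr\{\tilde{Z}_t \le z\} = \exp\{-e^{-z}\}, \quad -\infty < z < \infty. \tag{C.3}
\]

Probability and quantile plots can be made with (C.3) as the reference distribution (Coles (2001) (b)). Letting $\tilde{z}_{(1)} \le \cdots \le \tilde{z}_{(m)}$ denote the ordered values of the transformed variables from (C.2), the probability plot consists of the pairs

\[
\left\{\left(\frac{i}{m+1},\ \exp\{-\exp(-\tilde{z}_{(i)})\}\right);\ i = 1,\ldots,m\right\}
\]

and the quantile plot consists of

\[
\left\{\left(\tilde{z}_{(i)},\ -\log\left(-\log\frac{i}{m+1}\right)\right);\ i = 1,\ldots,m\right\}.
\]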
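The plot coordinates for the GEV case can be computed as in the sketch below. It assumes the standardization $\tilde{Z}_t = \xi(t)^{-1}\log\{1 + \xi(t)(Z_t - \mu(t))/\sigma(t)\}$ with nonzero shape, and takes the time-varying parameters as one value per observation. The function name `gev_residual_plot_points` is hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Pairs = List[Tuple[float, float]]


def gev_residual_plot_points(z: Sequence[float], mu: Sequence[float],
                             sigma: Sequence[float],
                             xi: Sequence[float]) -> Tuple[Pairs, Pairs]:
    """Return (probability plot pairs, quantile plot pairs) on the Gumbel scale."""
    # Standardize each observation with its own parameters, then order the results.
    resid = sorted(math.log(1.0 + x * (zt - m) / s) / x
                   for zt, m, s, x in zip(z, mu, sigma, xi))
    n = len(resid)
    prob = [(i / (n + 1), math.exp(-math.exp(-r)))
            for i, r in enumerate(resid, start=1)]
    quant = [(r, -math.log(-math.log(i / (n + 1))))
             for i, r in enumerate(resid, start=1)]
    return prob, quant


# Location with a linear trend; scale and shape held constant.
z = [10.5, 12.0, 11.8, 14.1]
mu = [10.0, 10.5, 11.0, 11.5]
prob, quant = gev_residual_plot_points(z, mu, [2.0] * 4, [0.1] * 4)
print(prob)
print(quant)
```

If the fitted model is adequate, both sets of pairs should lie close to the unit diagonal.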

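For the GPD, if we have that $Y_t$ is distributed as a GP$(\sigma(t), \xi(t))$, where $t = 1,\ldots,k$ ($k$ threshold excesses), then the transformation

\[
\tilde{Y}_t = \frac{1}{\xi(t)}\log\left\{1 + \xi(t)\left(\frac{Y_t - u(t)}{\sigma(t)}\right)\right\} \tag{C.4}
\]

follows the same standard exponential distribution for each of the $k$ excesses over the threshold, $u(t)$ ($u(t)$ may vary with time) (Coles (2001) (b)). In this case, with $\tilde{y}_{(1)} \le \cdots \le \tilde{y}_{(k)}$ denoting the ordered transformed values, the probability plot is formed by the pairs of points

\[
\left\{\left(\frac{i}{k+1},\ 1 - \exp(-\tilde{y}_{(i)})\right);\ i = 1,\ldots,k\right\}
\]

and the quantile plot is formed by

\[
\left\{\left(-\log\left(1 - \frac{i}{k+1}\right),\ \tilde{y}_{(i)}\right);\ i = 1,\ldots,k\right\}.
\]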
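The corresponding computation for the GPD uses the standard exponential distribution as the reference. In the sketch below it is assumed that `y` holds the raw observations that exceed their thresholds `u`, so that `y - u` is the excess, and that the shape is nonzero. The function name `gp_residual_plot_points` is hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Pairs = List[Tuple[float, float]]


def gp_residual_plot_points(y: Sequence[float], u: Sequence[float],
                            sigma: Sequence[float],
                            xi: Sequence[float]) -> Tuple[Pairs, Pairs]:
    """Return (probability plot pairs, quantile plot pairs) on the exponential scale."""
    resid = sorted(math.log(1.0 + x * (yt - ut) / s) / x
                   for yt, ut, s, x in zip(y, u, sigma, xi))
    k = len(resid)
    prob = [(i / (k + 1), 1.0 - math.exp(-r))
            for i, r in enumerate(resid, start=1)]
    quant = [(-math.log(1.0 - i / (k + 1)), r)
             for i, r in enumerate(resid, start=1)]
    return prob, quant


# Three exceedances of a time-varying threshold, with a time-varying scale.
prob, quant = gp_residual_plot_points(y=[5.0, 4.2, 7.5], u=[3.0, 3.5, 4.0],
                                      sigma=[2.0, 2.1, 2.2], xi=[0.5, 0.5, 0.5])
print(prob)
print(quant)
```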

Finally, for the point process model, the transformation

 (C.5)


is employed and the probability plot consists of the pairs



and the quantile plot consists of the pairs





