Appendix
A: Generalized Extreme Value distribution
Let $X_1,\ldots,X_n$ be a sequence of independent and identically distributed (i.i.d.) random variables with distribution function $F$, and let $M_n = \max\{X_1,\ldots,X_n\}$. For known $F$, the distribution of $M_n$ can be derived exactly for all values of $n$ because
$\Pr\{M_n \le u\} = \Pr\{X_i \le u \text{ for all } i=1,\ldots,n\}$, which, because the $X_i$ are independent, is equivalent to
$\Pr\{X_1 \le u\} \cdot \Pr\{X_2 \le u\} \cdots \Pr\{X_n \le u\}$, and, because the $X_i$ are identically distributed, this is equivalent to $(\Pr\{X_1 \le u\})^n$. Thus,
$\Pr\{M_n \le u\} = (F(u))^n$. Note, however, that the independence assumption, which is virtually never satisfied for weather and climate variables, can be relaxed (see
Dependence Issues).
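The identity above can be checked numerically. The following is a minimal sketch, assuming a Uniform(0,1) parent distribution (so that $F(u) = u$); the function name `empirical_max_cdf`, the sample sizes and the seed are illustrative choices and are not part of the toolkit.

```python
import random

def empirical_max_cdf(u, n, trials, rng):
    """Estimate Pr{M_n <= u} by simulating `trials` maxima of n uniforms."""
    hits = 0
    for _ in range(trials):
        m_n = max(rng.random() for _ in range(n))
        if m_n <= u:
            hits += 1
    return hits / trials

rng = random.Random(0)   # fixed seed so the run is repeatable
n, u = 5, 0.8            # illustrative values only
estimate = empirical_max_cdf(u, n, trials=20000, rng=rng)
exact = u ** n           # (F(u))^n with F(u) = u for Uniform(0,1)

print(f"simulated Pr(M_n <= u) = {estimate:.4f}")
print(f"exact (F(u))^n         = {exact:.4f}")
```

The simulated proportion should agree with $(F(u))^n$ up to Monte Carlo error.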
The problem with the above exact distribution is that $F$ is generally not known in practice and must therefore be estimated. However, small discrepancies between $F$ and its estimate, say $\hat{F}$, can lead to large discrepancies between $F^n$ and $\hat{F}^n$. A widely accepted alternative is to accept $F$ as unknown and look for approximate models for $F^n$ that can be estimated on the basis of the extreme data alone.
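A short numerical illustration of this sensitivity, using made-up values: suppose the true value of $F(u)$ is 0.999, the estimate is 0.998, and $n = 365$ (one year of daily data). These numbers are assumptions chosen only to show the effect.

```python
# Illustrative values only: F(u) and its estimate differ by 0.001.
n = 365            # e.g. one year of daily observations (assumed)
F_u = 0.999        # assumed true value of F(u)
F_hat_u = 0.998    # assumed estimate of F(u)

exact = F_u ** n          # F(u)^n
estimated = F_hat_u ** n  # F_hat(u)^n

print(f"F(u)^n     = {exact:.3f}")
print(f"F_hat(u)^n = {estimated:.3f}")
print(f"difference = {exact - estimated:.3f}")
```

A discrepancy of 0.001 in $F(u)$ becomes a discrepancy of roughly 0.2 in the distribution of the annual maximum.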
Of course, as $n$ increases, $F^n(z)$ quickly approaches zero for any fixed $z$ with $F(z) < 1$, because $F$ is a distribution function and therefore takes values only between zero and one. That is, $F^n(z) \to 0$ as $n \to \infty$. Thus, in order to achieve a nondegenerate limiting distribution, it is necessary to find sequences of constants $\{a_n > 0\}$ and $\{b_n\}$ such that the distribution of the normalized maximum $(M_n - b_n)/a_n$ converges to a nondegenerate distribution as $n \to \infty$. Specifically, we seek $\{a_n > 0\}$ and $\{b_n\}$ such that

$$\Pr\{(M_n - b_n)/a_n \le z\} = F^n(a_n z + b_n) \to G(z) \quad \text{as } n \to \infty,$$

where $G(z)$ does not depend on $n$.
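For example, suppose $F(x) = 1 - e^{-x}$ for $x > 0$ (the unit exponential distribution). Then

$$\Pr\{(M_n - b_n)/a_n \le u\} = \Pr\{M_n \le b_n + a_n u\} = F^n(b_n + a_n u).$$

Letting $a_n = 1$ and $b_n = \log n$ yields the following.

$$F^n(\log n + u) = \left[1 - \exp\{-(\log n + u)\}\right]^n = \left[1 - \tfrac{1}{n} e^{-u}\right]^n \to \exp\{-\exp(-u)\} \quad \text{as } n \to \infty,$$

which is a distribution known as the Gumbel distribution.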
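The convergence in the exponential example can be seen numerically. The sketch below compares the exact distribution of $M_n - \log n$ with the Gumbel limit at a single point; the function names and the choice $u = 1$ are illustrative assumptions.

```python
import math

def exact_max_cdf(u: float, n: int) -> float:
    """Pr{M_n - log n <= u} for n i.i.d. unit exponential variables."""
    x = math.log(n) + u
    if x <= 0.0:
        return 0.0          # F(x) = 0 for x <= 0
    return (1.0 - math.exp(-x)) ** n

def gumbel_cdf(u: float) -> float:
    """Limiting Gumbel distribution function exp{-exp(-u)}."""
    return math.exp(-math.exp(-u))

u = 1.0                      # evaluation point (illustrative)
for n in (10, 100, 1000, 10000):
    err = abs(exact_max_cdf(u, n) - gumbel_cdf(u))
    print(f"n={n:6d}  exact={exact_max_cdf(u, n):.6f}  |error|={err:.2e}")
```

The absolute error shrinks as $n$ grows, which is what the limit statement asserts.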
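In fact, the Gumbel is one of three possible types of distributions to which the suitably normalized $F^n$ can converge. The three types are:

I. Gumbel

$$G(z) = \exp\left\{-\exp\left[-\left(\frac{z-b}{a}\right)\right]\right\}, \quad -\infty < z < \infty$$

II. Fréchet

$$G(z) = \begin{cases} 0, & z \le b, \\ \exp\left\{-\left(\dfrac{z-b}{a}\right)^{-\alpha}\right\}, & z > b \end{cases}$$

III. Weibull

$$G(z) = \begin{cases} \exp\left\{-\left[-\left(\dfrac{z-b}{a}\right)\right]^{\alpha}\right\}, & z < b, \\ 1, & z \ge b \end{cases}$$

for parameters $a > 0$, $b$ and (for types II and III) $\alpha > 0$.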
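The above three families of distributions can be combined into one family of distributions known as the generalized extreme value (GEV) family. Namely,

$$G(z) = \exp\left\{-\left[1 + \xi\left(\frac{z-\mu}{\sigma}\right)\right]^{-1/\xi}\right\},$$

defined on $\{z : 1 + \xi(z-\mu)/\sigma > 0\}$, where $-\infty < \mu, \xi < \infty$ and $\sigma > 0$. The case $\xi = 0$ is interpreted as the limit $\xi \to 0$, which gives the Gumbel family. Please refer to
Coles (2001) (b) for more information on the GEV family.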
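As a minimal sketch of how the GEV distribution function might be evaluated, the following assumes the standard parameterization with location $\mu$, scale $\sigma > 0$ and shape $\xi$, treats $\xi = 0$ as the Gumbel limit, and handles points outside the support explicitly. The function name `gev_cdf` is hypothetical and is not a toolkit function.

```python
import math

def gev_cdf(z: float, mu: float = 0.0, sigma: float = 1.0, xi: float = 0.0) -> float:
    """GEV distribution function G(z) with location mu, scale sigma, shape xi."""
    if sigma <= 0.0:
        raise ValueError("sigma must be positive")
    s = (z - mu) / sigma
    if abs(xi) < 1e-12:
        # Gumbel limit as xi -> 0
        return math.exp(-math.exp(-s))
    t = 1.0 + xi * s
    if t <= 0.0:
        # Outside the support: below the lower endpoint when xi > 0,
        # above the upper endpoint when xi < 0.
        return 0.0 if xi > 0.0 else 1.0
    return math.exp(-t ** (-1.0 / xi))

# Illustrative evaluation for the three shape regimes
for xi in (-0.2, 0.0, 0.2):
    print(f"xi={xi:+.1f}  G(2)={gev_cdf(2.0, xi=xi):.4f}")
```

Positive $\xi$ corresponds to the Fréchet type, negative $\xi$ to the Weibull type, and $\xi = 0$ to the Gumbel type.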
B: Threshold Exceedances
Modeling only block maxima is wasteful if other data on extremes are available
(Coles (2001) (b)).
Let $X_1, X_2, \ldots$ be a sequence of independent and identically distributed (i.i.d.) random variables with distribution function $F$. Now, for some threshold, $u$, it follows that

$$\Pr\{X > u + y \mid X > u\} = \frac{1 - F(u+y)}{1 - F(u)}, \quad y > 0.$$
If F is known, then so is the above probability. However, this is often not the case
in practical applications and so approximations that are acceptable for high values of the
threshold are sought--similar to using the GEV distributions for block maxima.
The generalized Pareto distribution (GPD) arises in the peaks over threshold (POT)/point process (PP)
approach.
Generalized Pareto Distribution
Again, let $X_1, X_2, \ldots$ be a sequence of i.i.d. random variables with common distribution function $F$, and let $M_n = \max\{X_1,\ldots,X_n\}$. Now, assuming $F$ satisfies certain conditions (see Coles (2001) (b) for more information), for large $n$ we have that $\Pr\{M_n \le z\} \approx G(z)$, where

$$G(z) = \exp\left\{-\left[1 + \xi\left(\frac{z-\mu}{\sigma}\right)\right]^{-1/\xi}\right\}$$

for some $\mu$, $\sigma > 0$ and $\xi$. Then, for a large enough threshold, $u$, the distribution function of $(X-u)$, conditional on $X > u$, is approximately

$$H(y) = 1 - \left(1 + \frac{\xi y}{\sigma^*}\right)^{-1/\xi}$$

defined on $\{y : y > 0 \text{ and } (1 + \xi y/\sigma^*) > 0\}$, where $\sigma^* = \sigma + \xi(u - \mu)$. $H(y)$ is referred to as the generalized Pareto distribution (GPD). Again, see Coles (2001) (b) for more information
on the generalized Pareto distribution.
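The relationship between the GEV parameters and the GPD can be sketched in a few lines. The code below assumes the GPD scale is obtained from the GEV parameters as scale = $\sigma + \xi(u - \mu)$ and evaluates the GPD distribution function for an excess $y$; the function names `gp_scale` and `gpd_cdf` and the numerical values are illustrative assumptions.

```python
import math

def gp_scale(mu: float, sigma: float, xi: float, u: float) -> float:
    """GPD scale implied by GEV parameters (mu, sigma, xi) at threshold u."""
    return sigma + xi * (u - mu)

def gpd_cdf(y: float, scale: float, xi: float) -> float:
    """GPD distribution function H(y) for an excess y over the threshold."""
    if scale <= 0.0:
        raise ValueError("scale must be positive")
    if y <= 0.0:
        return 0.0
    if abs(xi) < 1e-12:
        # Exponential limit as xi -> 0
        return 1.0 - math.exp(-y / scale)
    t = 1.0 + xi * y / scale
    if t <= 0.0:
        # Only possible when xi < 0: y lies beyond the upper endpoint
        return 1.0
    return 1.0 - t ** (-1.0 / xi)

# Illustrative values only
mu, sigma, xi, u = 0.0, 1.0, 0.2, 5.0
scale = gp_scale(mu, sigma, xi, u)
print(f"GPD scale at u={u}: {scale:.3f}")
print(f"H(1) = {gpd_cdf(1.0, scale, xi):.4f}")
```

Note that the shape parameter $\xi$ is shared by the GEV and the GPD, while the GPD scale depends on the threshold.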
Peaks Over Threshold (POT)/Point Process (PP) Approach
The point process approach to the threshold excesses problem provides an interpretation of extreme
value behavior that unifies all of the other models. Additionally, it leads directly to a
likelihood that enables a more natural formulation of non-stationarity in threshold excesses than
can be obtained from the generalized Pareto model (Coles (2001) (b)). That is, in
this approach, the times at which high threshold exceedances occur and the excess values over the threshold are
combined into one process based on a two-dimensional plot of exceedance times and exceedance
values. The asymptotic theory of threshold exceedances shows that under suitable normalization,
this process behaves like a nonhomogeneous Poisson process (Smith (2002)).
For more information on this approach, see Coles (2001) (b),
Smith (1989) and Smith (2002).
Selecting a Threshold
Selecting an appropriate threshold is a critical problem with the POT methods. Too low a threshold is
likely to violate the asymptotic basis of the model, leading to bias; too high a threshold will
generate too few excesses, leading to high variance. The idea is to pick as low a threshold as possible
subject to the limit model providing a reasonable approximation. Two methods are available for this: the
first is an exploratory technique carried out prior to model estimation, and the second is an
assessment of the stability of parameter estimates based on the fitting of models across a range of
different thresholds (Coles (2001) (b)).
Suppose the raw data consist of a sequence of i.i.d. measurements $x_1,\ldots,x_n$ and let $x_{(1)},\ldots,x_{(k)}$ represent the subset of data points that exceed a particular threshold, $u$. Define threshold excesses by $y_j = x_{(j)} - u$ for $j=1,\ldots,k$. The first method requires plotting the points

$$\left\{\left(u, \frac{1}{k}\sum_{j=1}^{k} y_j\right) : u < x_{\max}\right\},$$

where $k$ and the $y_j$ depend on $u$ and $x_{\max}$ is the largest of the $x_i$.
The resulting plot is called the mean residual life plot in engineering and the mean excess function in the
extremes community.
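The points of a mean residual life plot can be computed directly from the definition of the threshold excesses. The following is a minimal sketch; the function name `mean_residual_life`, the simulated exponential data and the threshold grid are illustrative assumptions.

```python
import random

def mean_residual_life(data, thresholds):
    """Return (u, mean excess over u) for each threshold with at least one exceedance."""
    points = []
    for u in thresholds:
        excesses = [x - u for x in data if x > u]
        if excesses:  # skip thresholds above the largest observation
            points.append((u, sum(excesses) / len(excesses)))
    return points

# Illustrative data: simulated unit exponential sample (not real measurements)
rng = random.Random(1)
sample = [rng.expovariate(1.0) for _ in range(2000)]
grid = [0.25 * i for i in range(0, 13)]   # thresholds 0.0, 0.25, ..., 3.0

for u, mean_excess in mean_residual_life(sample, grid):
    print(f"u={u:4.2f}  mean excess={mean_excess:.3f}")
```

For data whose excesses follow a GPD, the mean excess is approximately linear in $u$ above a suitable threshold, so the plot is inspected for the lowest threshold beyond which it looks roughly linear.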
Poisson-GP Model
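The parameters of the point process model can be expressed in terms of those of the GEV distribution or,
equivalently through the transformations specified below, in terms of the parameters of a Poisson process and
the GPD (i.e., a Poisson-GP model). Specifically, given $\mu$, $\sigma$ and $\xi$ from the point process model, we have the following equations.

$$\log \lambda = -\frac{1}{\xi}\log\left[1 + \xi\,\frac{u-\mu}{\sigma}\right] \qquad \text{(B.1)}$$

$$\sigma^* = \sigma + \xi(u - \mu) \qquad \text{(B.2)}$$

where $\lambda$ is the Poisson rate parameter, $\sigma^*$ is the scale parameter of the GP and $\sigma$ the scale of the point process model. Eqs. (B.1)
and (B.2) can be solved simultaneously for $\sigma$ and $\mu$ to obtain the parameters of the associated point process model
(Katz et al. (2002)).
Specifically, solving Eqs. (B.1) and (B.2) for $\sigma$ and $\mu$ gives the following.

$$\log \sigma = \log \sigma^* + \xi \log \lambda \qquad \text{(B.3)}$$

$$\mu = u - \frac{\sigma}{\xi}\left(\lambda^{-\xi} - 1\right) \qquad \text{(B.4)}$$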
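The conversion between the two parameterizations can be sketched as a round trip. The code below assumes the relations take the form given in Katz et al. (2002), namely $\log\lambda = -(1/\xi)\log[1 + \xi(u-\mu)/\sigma]$ and GP scale $= \sigma + \xi(u-\mu)$, with $\xi \ne 0$ and $1 + \xi(u-\mu)/\sigma > 0$. The function names and parameter values are hypothetical.

```python
import math

def pp_to_poisson_gp(mu, sigma, xi, u):
    """Point process (mu, sigma, xi) -> Poisson rate and GP scale at threshold u."""
    lam = (1.0 + xi * (u - mu) / sigma) ** (-1.0 / xi)   # assumed form of (B.1)
    sigma_star = sigma + xi * (u - mu)                   # assumed form of (B.2)
    return lam, sigma_star

def poisson_gp_to_pp(lam, sigma_star, xi, u):
    """Poisson rate and GP scale -> point process location and scale."""
    sigma = math.exp(math.log(sigma_star) + xi * math.log(lam))  # assumed (B.3)
    mu = u - (sigma / xi) * (lam ** (-xi) - 1.0)                 # assumed (B.4)
    return mu, sigma

# Illustrative values only
mu, sigma, xi, u = 10.0, 2.0, 0.1, 14.0
lam, sigma_star = pp_to_poisson_gp(mu, sigma, xi, u)
mu_back, sigma_back = poisson_gp_to_pp(lam, sigma_star, xi, u)

print(f"lambda={lam:.4f}  GP scale={sigma_star:.4f}")
print(f"recovered mu={mu_back:.4f}  sigma={sigma_back:.4f}")
```

Recovering the original location and scale confirms that the two pairs of equations are inverses of each other for fixed $\xi$ and $u$.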
The block maxima and POT approaches can involve a difference in time scales, $h$. For example, if observations are
daily ($h \approx 1/365$) and annual maxima are modelled, then it is possible to convert the parameters
of the GEV distribution for time scale $h$ to the corresponding GEV parameters for time scale $h'$
(see Katz et al. (2005)) by converting the rate parameter, $\lambda$, to reflect the new time
scale.
C: Dependence Issues
The asymptotic distribution approximation of maxima and exceedances over a threshold assumes that
data are independent and identically distributed (i.i.d.), but this is often not the case with real
data. Nevertheless, the results can still be used. There are a few different methods for dealing
with this problem. One is to decluster the data so that cluster maxima are independent. Another is
to incorporate the dependence into a trend. For some data, the results are still valid even without
declustering or incorporating a trend.
When data are independent and identically distributed, we have that
$\Pr\{M_n \le u_n\} = F(u_n)^n$. If there is dependence, then we still have, approximately for large $n$, that
$\Pr\{M_n \le u_n\} \approx F(u_n)^{n\theta}$, where $\theta$, in the interval $(0,1]$, is called the extremal index (see
O'Brien (1987)). If the data are independent, then $\theta = 1$,
but the converse is not true (see, for example, p. 97 of Coles (2001) (b)).
At the other extreme, as $\theta \to 0$ the data are said to be perfectly dependent.
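Ferro and Segers (2003) present several estimators of the extremal
index. The one they suggest as the "best" (and the one used by this toolkit) is
defined as

$$\hat{\theta} = \begin{cases} \min\left\{1, \dfrac{2\left(\sum_{i=1}^{N-1} T_i\right)^2}{(N-1)\sum_{i=1}^{N-1} T_i^2}\right\}, & \text{if } \max\{T_i\} \le 2, \\[2ex] \min\left\{1, \dfrac{2\left(\sum_{i=1}^{N-1} (T_i - 1)\right)^2}{(N-1)\sum_{i=1}^{N-1} (T_i - 1)(T_i - 2)}\right\}, & \text{if } \max\{T_i\} > 2, \end{cases} \qquad \text{(C.1)}$$

where $T_i$, $i=1,\ldots,N-1$, are the interexceedance times (the lengths of time between successive exceedances) and $N$ is the number of exceedances.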
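A minimal sketch of the intervals estimator of Ferro and Segers (2003), written from the published formula rather than from the toolkit source, is given below. It assumes exceedance times are integer time indices; the function name `intervals_estimator` and the example times are hypothetical.

```python
def intervals_estimator(exceedance_times):
    """Intervals estimator of the extremal index from integer exceedance times."""
    times = sorted(exceedance_times)
    gaps = [b - a for a, b in zip(times, times[1:])]  # interexceedance times T_i
    if not gaps:
        raise ValueError("need at least two exceedances")
    m = len(gaps)  # N - 1, where N is the number of exceedances
    if max(gaps) <= 2:
        numerator = 2.0 * sum(gaps) ** 2
        denominator = m * sum(t * t for t in gaps)
    else:
        numerator = 2.0 * sum(t - 1 for t in gaps) ** 2
        denominator = m * sum((t - 1) * (t - 2) for t in gaps)
    return min(1.0, numerator / denominator)

# Illustrative example: three tight clusters of three exceedances each
clustered = [1, 2, 3, 50, 51, 52, 100, 101, 102]
# Illustrative example: evenly spread exceedances with no clustering
spread = [0, 10, 20, 30]

print(f"clustered: theta_hat = {intervals_estimator(clustered):.3f}")
print(f"spread:    theta_hat = {intervals_estimator(spread):.3f}")
```

Clustered exceedances give an estimate well below one, whereas evenly spread exceedances give an estimate of one.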
Probability and Quantile Plots for Non-stationary Sequences
For non-stationary time series, it is possible to incorporate a trend (or covariate) into the parameters
of the GEV, GPD or Point Process models. Subsequently, each time point (or covariate value) has a different
distribution associated with it. In order to plot model diagnostics, therefore, it is necessary to
transform the data in such a way that each point has the same distribution. This can be accomplished in
the following ways.
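In the case of the GEV distribution, if we have $Z_t$ distributed as a
$\mathrm{GEV}(\mu(t), \sigma(t), \xi(t))$, then the standardized variables,

$$\tilde{Z}_t = \frac{1}{\xi(t)} \log\left\{1 + \xi(t)\left(\frac{Z_t - \mu(t)}{\sigma(t)}\right)\right\}, \qquad \text{(C.2)}$$

each have the standard Gumbel distribution with distribution function

$$\Pr\{\tilde{Z}_t \le z\} = \exp\{-e^{-z}\}, \quad -\infty < z < \infty. \qquad \text{(C.3)}$$

In practice, the parameters in (C.2) are replaced by their estimates. Probability and quantile plots can be made with (C.3) as the reference
distribution (Coles (2001) (b)). Letting $\tilde{z}_{(1)} \le \cdots \le \tilde{z}_{(m)}$ denote the
ordered values of the transformed variables from (C.2), the probability plot consists of
the pairs

$$\left\{\left(\frac{i}{m+1}, \exp\{-\exp(-\tilde{z}_{(i)})\}\right) ; \; i=1,\ldots,m\right\}$$

and the quantile plot consists of

$$\left\{\left(\tilde{z}_{(i)}, -\log\left(-\log\left(\frac{i}{m+1}\right)\right)\right) ; \; i=1,\ldots,m\right\}.$$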
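The standardization to a common Gumbel scale, and the construction of the plotting pairs, can be sketched as follows. The code assumes the transformation $\tilde{z}_t = \xi(t)^{-1}\log\{1 + \xi(t)(z_t - \mu(t))/\sigma(t)\}$ described in Coles (2001), with plotting positions $i/(m+1)$; the function names, the linear trend in the location parameter and the data values are made up for illustration.

```python
import math

def standardize_gev(z, mu, sigma, xi):
    """Transform a GEV(mu, sigma, xi) value to the standard Gumbel scale."""
    if abs(xi) < 1e-12:
        return (z - mu) / sigma          # Gumbel case: simple standardization
    return math.log(1.0 + xi * (z - mu) / sigma) / xi

def gumbel_plot_pairs(z_tilde):
    """Return (probability plot pairs, quantile plot pairs) for standardized values."""
    ordered = sorted(z_tilde)
    m = len(ordered)
    prob = [(i / (m + 1), math.exp(-math.exp(-z)))
            for i, z in enumerate(ordered, start=1)]
    quant = [(z, -math.log(-math.log(i / (m + 1))))
             for i, z in enumerate(ordered, start=1)]
    return prob, quant

# Hypothetical annual maxima with an assumed fitted trend mu(t) = 10 + 0.5 t
maxima = [11.2, 12.9, 11.8, 14.1, 13.0, 15.6]
sigma_hat, xi_hat = 1.5, 0.1             # assumed fitted scale and shape
z_tilde = [standardize_gev(z, 10.0 + 0.5 * t, sigma_hat, xi_hat)
           for t, z in enumerate(maxima)]

prob_pairs, quant_pairs = gumbel_plot_pairs(z_tilde)
for (p_emp, p_mod), (q_emp, q_mod) in zip(prob_pairs, quant_pairs):
    print(f"prob: ({p_emp:.3f}, {p_mod:.3f})   quant: ({q_emp:.3f}, {q_mod:.3f})")
```

If the fitted model is adequate, both sets of pairs should lie close to the unit diagonal.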
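For the GPD, if the excess $Y_t$ over the threshold $u(t)$ (which may vary with time) is distributed as a
$\mathrm{GP}(\sigma(t), \xi(t))$, where $t=1,\ldots,k$ indexes the $k$ threshold
excesses, then the transformed variables

$$\tilde{Y}_t = \frac{1}{\xi(t)} \log\left\{1 + \xi(t)\,\frac{Y_t}{\sigma(t)}\right\} \qquad \text{(C.4)}$$

each follow the same standard exponential distribution (Coles (2001) (b)).
In this case, with $\tilde{y}_{(1)} \le \cdots \le \tilde{y}_{(k)}$ denoting the ordered transformed values, the probability plot is formed by the pairs of points

$$\left\{\left(\frac{i}{k+1}, 1 - \exp(-\tilde{y}_{(i)})\right) ; \; i=1,\ldots,k\right\}$$

and the quantile plot is formed by

$$\left\{\left(\tilde{y}_{(i)}, -\log\left(1 - \frac{i}{k+1}\right)\right) ; \; i=1,\ldots,k\right\}.$$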
Finally, for the point process model, the transformation
 | (C.5) |
is employed and the probability plot consists of the pairs
and the quantile plot consists of the pairs